A new Python script, scripts/download.py, is added to fetch tarballs using
the GitHub archive API [1] and then repack them in a reproducible way,
matching the output of the current DownloadMethod/git.
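
As an illustration only (function names, the archive URL layout and the
exact tar/gzip options below are assumptions, not necessarily what
scripts/download.py does), the reproducible repack could look like:

    # Sketch: fetch a GitHub archive tarball and repack it reproducibly.
    import subprocess
    import urllib.request

    def fetch_archive(owner, repo, ref, dest):
        # archive link endpoint, see [1]
        url = 'https://api.github.com/repos/%s/%s/tarball/%s' % (owner, repo, ref)
        urllib.request.urlretrieve(url, dest)

    def repack(srcdir, dest, mtime):
        # Fixed --mtime, sorted member names, numeric 0:0 ownership and
        # gzip -n (no embedded timestamp) give a bit-for-bit stable tarball.
        tar = subprocess.Popen(
            ['tar', '--numeric-owner', '--owner=0', '--group=0',
             '--sort=name', '--mtime=@%d' % mtime, '-c', '-C', srcdir, '.'],
            stdout=subprocess.PIPE)
        with open(dest, 'wb') as out:
            subprocess.check_call(['gzip', '-n'], stdin=tar.stdout, stdout=out)
        tar.stdout.close()
        if tar.wait() != 0:
            raise RuntimeError('tar failed')
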
GitHub imposes a rate limit of 60 requests per hour on unauthenticated API
access [2]. This affects fetching the commit date that feeds the tar
--mtime= argument. However, observation indicates that archive downloads
are NOT subject to this limit at the moment. In the rare cases where the
download fails because of rate limiting, we will fall back to using
DownloadMethod/git.
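
For illustration, fetching the commit date might look roughly like the
sketch below; the endpoint is the standard commits API, and the 403
handling that triggers the fallback is simplified:

    # Sketch: fetch the committer date of a commit for tar --mtime=.
    # Returning None signals the caller to fall back to DownloadMethod/git.
    import calendar
    import json
    import time
    import urllib.error
    import urllib.request

    def commit_mtime(owner, repo, sha):
        url = 'https://api.github.com/repos/%s/%s/commits/%s' % (owner, repo, sha)
        try:
            with urllib.request.urlopen(url) as resp:
                data = json.load(resp)
        except urllib.error.HTTPError as err:
            if err.code == 403:
                return None  # likely rate limited
            raise
        date = data['commit']['committer']['date']  # e.g. 2019-01-02T03:04:05Z
        return calendar.timegm(time.strptime(date, '%Y-%m-%dT%H:%M:%SZ'))
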
The missing piece in the GitHub API is that the generated tarball does not
include the source code of dependent submodules. In that case, the
implementation will also fall back to using DownloadMethod/git.
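
One way to detect this case is to probe for a .gitmodules file at the
requested ref (a sketch only; the actual check in scripts/download.py may
differ):

    # Sketch: a repository with .gitmodules at the given ref would produce
    # an incomplete archive tarball, so the git fallback is used instead.
    import urllib.error
    import urllib.request

    def has_submodules(owner, repo, ref):
        url = ('https://api.github.com/repos/%s/%s/contents/.gitmodules?ref=%s'
               % (owner, repo, ref))
        try:
            urllib.request.urlopen(url).close()
            return True       # .gitmodules present
        except urllib.error.HTTPError as err:
            if err.code == 404:
                return False  # no submodules
            raise             # e.g. 403 rate limit; let the caller decide
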
[1] Get archive link, https://developer.github.com/v3/repos/contents/#get-archive-link
[2] Rate limiting, https://developer.github.com/v3/#rate-limiting
v2 <- v1:
- allow passing multiple URLs with the --urls argument
- add a commit timestamp cache, which can be helpful on retries (a sketch
  of such a cache follows)
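
The commit timestamp cache could be a small on-disk JSON map keyed by
repo and commit (an illustrative sketch; path and format are arbitrary
choices, not necessarily what the script uses):

    # Sketch: remember commit timestamps so a retry after a failed
    # download does not hit the rate-limited API again.
    import json
    import os

    CACHE = os.path.expanduser('~/.cache/download-commit-ts.json')

    def cache_load():
        try:
            with open(CACHE) as f:
                return json.load(f)
        except (OSError, ValueError):
            return {}

    def cache_store(cache):
        os.makedirs(os.path.dirname(CACHE), exist_ok=True)
        with open(CACHE, 'w') as f:
            json.dump(cache, f)
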
Signed-off-by: Yousong Zhou <yszhou4tech@gmail.com>