feat: add source-merge subcommand #6
Which name do you propose instead for the command?
96356e5 to 6921220
31b1f44 to 88886bf
88886bf to 58d1c4d
On repeated executions, some files have already been downloaded. These are currently skipped when downloading again, but they still show up in the statistics. We now split the statistics into the total data and the data we already have. This gives a better overview of what is still missing and needs to be downloaded. We further change the return type to a named tuple to make it easier for downstream users to access the various values. Signed-off-by: Felix Moessbauer <[email protected]>
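A minimal sketch of what such a split could look like, assuming a named tuple with separate totals and already-present counts. The names `DownloadStats`, `collect_stats`, and all field names are hypothetical and are not taken from the PR:

```python
from pathlib import Path
from typing import Iterable, NamedTuple


class DownloadStats(NamedTuple):
    """Hypothetical result type; the field names are illustrative only."""

    total_files: int
    total_bytes: int
    cached_files: int
    cached_bytes: int

    @property
    def missing_bytes(self) -> int:
        # Data that still has to be fetched.
        return self.total_bytes - self.cached_bytes


def collect_stats(files: Iterable[tuple[str, int]], download_dir: Path) -> DownloadStats:
    """Count all (name, size) entries and, separately, those already on disk."""
    total_files = total_bytes = cached_files = cached_bytes = 0
    for name, size in files:
        total_files += 1
        total_bytes += size
        if (download_dir / name).is_file():
            cached_files += 1
            cached_bytes += size
    return DownloadStats(total_files, total_bytes, cached_files, cached_bytes)
```

Because the result is a named tuple, callers can read `stats.cached_bytes` by name while code that unpacks the tuple positionally keeps working.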
All Debian source packages come with a .dsc file that provides the links to the other files (e.g. .orig.tar or .debian.tar), along with some other information. As a safety measure, we check on download whether every source package has this file. Signed-off-by: Felix Moessbauer <[email protected]>
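The check described here could be sketched as follows, assuming the downloader knows the list of file names that belong to one source package. `MissingDscError` and `find_dsc` are hypothetical names used for illustration:

```python
class MissingDscError(Exception):
    """Hypothetical error type for a source package without a .dsc file."""


def find_dsc(package: str, filenames: list[str]) -> str:
    """Return the .dsc file name of a source package or raise if there is none."""
    dsc_files = [name for name in filenames if name.endswith(".dsc")]
    if not dsc_files:
        raise MissingDscError(f"source package {package} has no .dsc file")
    return dsc_files[0]
```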
Signed-off-by: Felix Moessbauer <[email protected]>
Signed-off-by: Felix Moessbauer <[email protected]>
A source package consists of multiple individual parts (e.g. policy 4.x diff files, .orig and .debian tarballs). To create a single artifact which can be used for license clearing, we need to merge these into a single archive. For that, we introduce the SourceArchiveMerger class, which performs the merge based on the .dsc data. Signed-off-by: Felix Moessbauer <[email protected]>
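To illustrate what "based on the .dsc data" can mean, the sketch below reads the `Files:` field of a .dsc file and returns the names of the referenced parts. It is not the SourceArchiveMerger implementation; `referenced_files` is a hypothetical helper, and a real merger would go on to unpack the tarballs and apply any diffs:

```python
def referenced_files(dsc_text: str) -> list[str]:
    """Return the file names listed in the Files field of a .dsc file.

    Each entry of the field is an indented continuation line of the form
    "<md5sum> <size> <filename>".
    """
    names = []
    in_files = False
    for line in dsc_text.splitlines():
        if line.startswith("Files:"):
            in_files = True
            continue
        if in_files:
            if line.startswith(" "):
                parts = line.split()
                if len(parts) == 3:
                    names.append(parts[2])
            else:
                # The next field (or a blank line) ends the Files block.
                in_files = False
    return names
```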
We add the merge subcommand which runs against a download directory and creates combined archive files, which can be used as input to license clearing tools that only support a single archive per component. Signed-off-by: Felix Moessbauer <[email protected]>
We add a unit test that checks the various Debian formats, as well as compressing the output tar with all supported compressors. Signed-off-by: Felix Moessbauer <[email protected]>
The mirror might have the same file under various names and paths. Currently, we only return the first instance, but this is not sufficient if other files link to a filename that is not the first instance. We now expand this list and return all instances. Signed-off-by: Felix Moessbauer <[email protected]>
As the snapshot client now returns all file instances, we also download them multiple times. To optimize this, we check if we already have a file with that hash and just link it. Signed-off-by: Felix Moessbauer <[email protected]>
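One way to picture the optimization, as a sketch only: keep an index from content hash to an already downloaded file and create a link instead of downloading again. The use of SHA-256 and hard links is an assumption (the commit message only says "link"), and `sha256_of` and `link_if_known` are hypothetical names:

```python
import hashlib
import os
from pathlib import Path


def sha256_of(path: Path) -> str:
    """Compute the SHA-256 hex digest of a file in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            digest.update(chunk)
    return digest.hexdigest()


def link_if_known(known: dict[str, Path], sha256: str, target: Path) -> bool:
    """Hard-link target to an existing file with the same hash.

    Returns True if the download can be skipped, False if the file
    still has to be fetched.
    """
    existing = known.get(sha256)
    if existing is None or not existing.is_file():
        return False
    if not target.exists():
        os.link(existing, target)  # assumption: both paths are on one filesystem
    return True
```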
58d1c4d to 3ff44a8
We add the merge subcommand which runs against a download directory and creates combined archive files, which can be used as input to license clearing tools that only support a single archive per component.