Posts: 2
Comments: 7
Joined: 3 yr. ago

  • This is very concerning. DOJ has stated explicitly that any CSAM was removed before releasing the files. Should I remove the magnet link to the merged Data Set 9 torrent?

    I haven't looked inside any of these sets myself. My primary goal has been to get the DOJ data distributed.

  • Amazing, thank you. That was my thought: check hashes while merging the files, keeping any copies that might have been modified by DOJ and discarding duplicates even if the duplicates have different metadata, e.g. timestamps.

  • Thank you for this!

    I've added all magnet links for sets 1-8 to the original post. Magnet links for 9-11 match OP. The magnet link for 12 is different, but we've identified that there are at least two versions: DOJ removed files before the second version was downloaded. OP contains the early version of Data Set 12.

  • When merging versions of Data Set 9, is there any risk of loss with simply using rsync --checksum to dump all files into one directory?
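
For what it's worth, here is a rough sketch of the hash-based merge described in the comments above, as an alternative to a plain rsync into one directory. As far as I understand it, `rsync --checksum` only changes how rsync decides whether a file needs transferring (checksum instead of size and modification time); it still writes to the same destination path, so two different versions of a file that share a relative path cannot both survive without something like `--backup`. The sketch below keeps every distinct content hash and skips byte-identical duplicates regardless of name or timestamp. The function names (`sha256_of`, `merge_by_hash`) and the hash-suffix renaming scheme are my own assumptions for illustration, not anything from the DOJ sets or an existing tool.

```python
import hashlib
import shutil
from pathlib import Path


def sha256_of(path: Path, chunk_size: int = 1 << 20) -> str:
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with path.open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def merge_by_hash(sources: list[Path], dest: Path) -> dict[str, Path]:
    """Copy every file with distinct content from `sources` into `dest`.

    Hypothetical helper: byte-identical duplicates are skipped even when
    their names or timestamps differ. If a different version already
    occupies the same relative path, the newcomer is saved next to it
    with a short hash suffix instead of overwriting it.
    Returns a mapping of content hash -> path written under `dest`.
    """
    dest.mkdir(parents=True, exist_ok=True)
    seen: dict[str, Path] = {}
    for source in sources:
        for path in sorted(p for p in source.rglob("*") if p.is_file()):
            file_hash = sha256_of(path)
            if file_hash in seen:
                continue  # same bytes already kept; metadata is ignored
            target = dest / path.relative_to(source)
            if target.exists():
                # Same relative path, different content: keep both versions.
                target = target.with_name(
                    f"{target.stem}.{file_hash[:12]}{target.suffix}"
                )
            target.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, target)
            seen[file_hash] = target
    return seen
```

Usage would be something like `merge_by_hash([Path("set9_v1"), Path("set9_v2")], Path("set9_merged"))`, where the directory names are placeholders. The order of `sources` decides which version keeps the plain filename; later differing versions get the suffix. It assumes `dest` is not inside any of the source directories.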
