• plateee@piefed.social
    1 month ago

    I’d be shocked if they weren’t feeding in old data. Anything can be training data if you’re desperate enough, including early-2000s Myspace pages scraped by the Wayback Machine.

    • knightly the Sneptaur@pawb.social
      1 month ago

      The Wayback Machine is too lossy and would have missed most of my written corpus. I’m talking about finding me in the full Twitter firehose from before 2008 and accounting for the fact that my writing style shifts notably with new slang.