Okay. Yes. This sounds like the dumbest question.

I have a PDF with a bunch of data laid out in the ugliest possible way. I’ve tried pulling it into an Excel sheet myself and messing with PowerShell to clean it up, but the result is awful, I’m too much of a novice, and I frankly don’t have the time.

This is a personal request, not actually condoned by my workplace, but the data wouldn’t be sensitive. So could I find someone on, like, Fiverr? I honestly don’t know where to even look.

Any suggestions?

    • Suck_on_my_Presence@lemmy.world (OP)
      1 day ago

      Okay finally managed to grab a table or two - sorry for the delay. But yeah this is what they look like and I just need to consolidate this data and format it better haha

      • TheDarkQuark@lemmy.world
        1 day ago

        Try using an LLM.

        I tried to timebox myself with a Python OCR solution, but ended up spending way more time than I should’ve, without ever arriving at a good solution.

        I then tried using Qwen 235B via Brave’s Leo AI and got some okay results. (I think you’ll have better luck with commercial AI models; I just don’t have an account with them atm.)


      • AmidFuror@fedia.io
        2 days ago

        The data may already be selectable text, but usually the formatting after copy paste is horrific.
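
        If the text does turn out to be selectable, a rough Python sketch of the copy-paste route might look like the one below. It assumes the pasted columns end up separated by tabs or by runs of two or more spaces, which may well not hold for tables nested inside tables. The function names and the sample data are made up for illustration; the actual tables from the PDF are not reproduced here.

```python
import csv
import io
import re


def pasted_text_to_rows(text):
    """Split each non-empty line into cells at a tab or a run of 2+ whitespace chars."""
    rows = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines left over from the paste
        rows.append(re.split(r"\s{2,}|\t", line))
    return rows


def rows_to_csv(rows):
    """Serialize the rows to CSV text that a spreadsheet program can open."""
    buffer = io.StringIO()
    csv.writer(buffer, lineterminator="\n").writerows(rows)
    return buffer.getvalue()


# Hypothetical pasted sample, not taken from the PDF discussed in this thread.
sample = """
Item        Qty    Unit price
Widget A    3      4.50
Widget B    12     0.75
"""

rows = pasted_text_to_rows(sample)
print(rows_to_csv(rows))
```

        Single spaces inside a cell (like "Unit price") are left alone, so only wider gaps count as column breaks. Rows that come out with the wrong number of cells would still need fixing by hand.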

    • Suck_on_my_Presence@lemmy.world (OP)
      2 days ago

      The PDF is mostly a bunch of tables, but it’s tables within tables. I honestly don’t think I care to try OCR into a CSV into Python or anything.

      I honestly loathe these documents and reading them, so I’m happy to just pay for the help.