Viewing 40 posts - 1 through 40 (of 63 total)
  • OCR help
  • Cougar
    Full Member

    I’ve got an image which I want to run through some OCR software. I’ve tried a few online tools now and they kinda work but are inaccurate. What I’m converting is code rather than real words (which might be contributing to throwing out the online converters) so it needs to be accurate. Short of me going through it and checking manually, has anyone got any recommendations for reliable OCR tools? Or even, has access to something a bit more robust than random website who would be able to process it for me?

    Cheers.

    thepurist
    Full Member

    If you can paste it into MS OneNote that’ll do OCR, not tried it on code though.

    aracer
    Free Member

    For code I’d be inclined to just type it in. You’ll probably waste more time getting the OCR to work properly and fixing errors.

    Cougar
    Full Member

    If you can paste it into MS OneNote that’ll do OCR, not tried it on code though.

    Ooh, that’s handy, I didn’t know that. Doesn’t work in this case though (it was considerably worse in fact).

    leffeboy
    Full Member

    I’ve never used anything that’s been fantastic or not required significant work after :(. Have you tried Google Drive yet? I would expect them to be quite good really, seeing as how good they are at finding random text in my photos.

    Cougar
    Full Member

    For code I’d be inclined to just type it in.

    Bugger that. It’s not program code, it’s code code – it looks like it could be Base64 encoding. (The purpose of the exercise here is to work out what it is and then try to decode it.)

    Greybeard
    Free Member

    Is it a good sharp image, with good contrast, etc? If not, that’s probably one reason why OCR is struggling. If not, can you enhance the image? Some OCR (I think) will favour letters that make dictionary words, whereas your code will be random, so you may need a very basic OCR that doesn’t try to be clever. But I suspect any wrong characters at all will make the code unrecognisable, and 100% correct OCR is rare.

    Cougar
    Full Member

    Trying Google Docs, it’s been spinning for about 10 minutes so far.

    Is it a good sharp image, with good contrast, etc?

    The image isn’t bad, suppose it’s all relative.

    100% correct OCR is rare.

    Sure. I just hoped technology had moved on since I last tried anything like this.

    seosamh77
    Free Member

    Don’t fancy your chances. I’ve only ever used Acrobat Pro to OCR; it’s never 100%, but it can be decent. Image quality/orientation/straightness/sharpness is usually a big factor. But even very high-res stuff can throw up loads of errors. I think some typefaces are probably easier than others.

    woffle
    Free Member

    I do a bunch of OCR processing on scanned PDFs at work – a mixture of type and handwriting – there’s a fair bit of shorthand in there (trading strategy codes, i.e. strikes, instruments etc). It uses Tesseract-OCR and is pretty accurate – though it can struggle with some non-alphanumeric characters (e.g. bracket types can be a PITA).

    Not sure if there are any online instances you can use though.

    geoffj
    Full Member

    Modern OCR algorithms tend to work on recognition and then comparison to a dictionary. Great for words, but not so great for random characters.
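
    A rough sketch of how the "basic OCR that doesn’t try to be clever" idea might look with Tesseract, which was mentioned above. It only builds the command line; it does not run OCR. `tessedit_char_whitelist`, `load_system_dawg` and `load_freq_dawg` are Tesseract configuration variables, but how well they are honoured has varied between Tesseract versions, so treat the result as something to experiment with. The file names are hypothetical.

```python
import shlex

# The 64 Base64 symbols plus '=' padding: the only characters expected in the image.
BASE64_CHARS = (
    "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
    "abcdefghijklmnopqrstuvwxyz"
    "0123456789+/="
)


def tesseract_command(image_path: str, output_base: str) -> list[str]:
    """Build a Tesseract CLI invocation aimed at non-dictionary text."""
    return [
        "tesseract", image_path, output_base,
        "--psm", "6",  # page segmentation mode 6: one uniform block of text
        "-c", f"tessedit_char_whitelist={BASE64_CHARS}",  # only allow Base64 symbols
        "-c", "load_system_dawg=0",  # ask Tesseract not to load its main word list
        "-c", "load_freq_dawg=0",    # ask Tesseract not to load its frequent-word list
    ]


if __name__ == "__main__":
    # "puzzle.png" and "puzzle_ocr" are hypothetical names for the image and the output.
    print(shlex.join(tesseract_command("puzzle.png", "puzzle_ocr")))
```

    The printed command can be pasted into a shell on a machine with Tesseract installed. Restricting the character set does not stop look-alike mix-ups inside that set, so the output still needs checking.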

    Jamie
    Free Member

    Post it and let the nerds on here have a go.

    Cougar
    Full Member
    Jamie
    Free Member

    There are errors.

    qANQR1DBwU4D/TlT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy
    ZBIEFtjyqCd96qF38sp9IQiJIKUNaZfx2GLRWikPZwchUXxB+AA5+qsGELBvRa
    c9XefaYpbbAZ6Z6LkOG+eEOXASe7aEEPfcdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv
    z/9Ak4/0LniJRk?5/2UNE520a-31cvIT?mfGajv?hk?q?cav??Kiin3hv7+V?88
    uLLem2/fQHZhGcQvkqZVqXx8SmNw5gzuvwjV1WHj9AnuDGBYGDMkjiZIRI7azWnoU9
    3KCnmpR6DVO4rDRAS5uGl9fioSwze+q8XqxubaNsgdKkoD+tB/4u4c4tznlfw1L2
    YBS+dzFDw5desMFSo7JkecAS4NB9jAu9K+f7PTAsesCBNETDd49BTOFFTWWavAfE
    gLYcPrcn4s3EriUgvL30zPR4P1chNu6sa3ZJkTBbriDoA3VpnqG3hxqfNy0lqAka
    mJJuQ530b9ThaFH8YcE/VqUFdw+bGtrAJ6NpjIxi/xOff0InhC/bBw7pDLXBFNaX
    HdlLQRPQdrannwskKzn0Sarxq4GjpRTQo4hpCRJJ5aU7tZ09HPTZXFG6iRITDwa47
    AR5nvkEKoIAj5HaDKiJriubLdtN40XecWvxFsjR32ebz76U8aLpAK87GZEyTzBx
    dV+H0hwyT/y1cZQ/E5USePP4oKWF4uqquPee10PeFMBo4CvuGyhZXD/18Ft/53Y
    WIebvdiCqs0oabK3jEfdGExce63zDIO=
    =MpRf

    It’s 85-90% there I think.

    geoffj
    Full Member

    qANQR1DBuU4D/TlT68XXuiUQCADf§2o4b4aFY8cHumA7hR1Hvz9rbv2BR6UbEUsy
    ZBIEFtjYqCd96qF38SP9IQiJIKlNaZfx2GLRHikPZHChUXXB+AA5+lqSG/ELBVRa
    c9Xef aYpbbAZ6z6LkOQ+eEOXASe7aEEPfdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv
    z/9Ak4/0LnLiJRkos/2unEszoa+3lcvlrnmfeajvRhkxqocavpoKiin3hv7+vx88
    uLLem2/fQHZhGcQvkqZVqXx8SmNu5gzuvujv1HHj9muDGBYOMkjiZIRI7azVnoU9
    3KCnmpR60vOArDRASSuGl9fioSvze+q8XqxubansgdKkoD+t8/AukcktznLfu1L2
    YBS+dzFDH5deSHFSo7JkecAS4N89jAU9K+f7PTASeSCBNETDd49BTOFFTUUavAfE
    gLycprcnLs3EriUgvL30zpRAp1 chnu6sa32JkTBbriDoA3vpnqG3hxqfnyolqAka
    MJJuQ530b9ThaFH8YCE/VqUFdH+bQtrAJ6Npjlxi/xOFfOlnhC/bBH7PDLX8FNaX
    HdlLQRPQdrmnUskKznOSarxq4GjpRTQo4hpCRJJSaU7tZ09HPTZXFG6iRITOwa47
    AR5nvkEKOIAjUSHaDKiJriUULdtN40XeCUVXFSjR32ebZ76U8aLPAK87GZEYTZBX
    dV+lHOhuyT/y1 c2Q/E5USePP4oKHF4uqquPee10PeFMBoACvuGyh2XD/18Ft/S3Y
    HlebvdicqsooabK3jEfdGExce63zDIo=
    =MpRf

    Errors here too 🙁

    seosamh77
    Free Member

    qANQR1DBwU4D/TlT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy
    ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AAS+lqsG/ELBvRa
    c9XefaYpbbAZ6z6LkOQ+eE0XASe7aEEPfdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv
    z/9Ak4/0lnliJRk05/2UNESZ0a+3lcviTMmfGajvRhkXqocavPOKiin3hv7+Vx88
    ullem2/fQHZhGcQvkqZVqXx8SmNw5gzuvwjV1WHj9muDGBY0MkjiZIRI7azWnoU9
    3KCnmpR60V04rDRAS5uGL9fioSvze+q8XqxubaNsgdKkoD+tB/4u4c4tznlfw1L2
    YBS+dzFDw5desMFSo7JkecAS4NB9jAu9K+f7PTAsesCBNETDd498TOFFTWWavAfE
    gLYcPrcn4s3EriUgvl30zPR4P1chNu6sa32JkTBbriDoA3VpnqG3hxqfNyOlqAka
    mJJuQ530b9ThaFH8YcE /VqUFdw+bQtrAJ6Npjlxi/x0FfOinhC/b6w7pDLXBFNaX
    HdlLQRPQdrmnWskKznOSarxq4GjpRTQo4hpCRJJSaU7tZ09HPTZXFG6iRIT0wa47
    ARSnvkEKolAjWSHaDKiJri uWLdtN40XecWvxfsjR32ebz76U8alpAK87GZEyTzBx
    dV+lH0hwyT/y1cZQ/ESUSePP4oKWF4uqquPee10PeFMBo4CvuGyhZXD/18Ft/53Y
    WlebvdiCqsOoabK3jEf dGExce63zDI0=
    =MpRf

    No idea if there are errors, I’ll let you check! That’s from Acrobat. It probably sees the likes of those lower-case “els” as upper-case “eyes”, so you need to be careful of things like that.

    Cougar
    Full Member

    Yeah, I did it with errors too, hence the question in the first place.

    Cougar
    Full Member

    That Acrobat one looks to be the closest yet. In fact, at a glance I think the only errors are extra spaces.

    I’d need to check it manually to be sure, but that’s a great start. Thanks.

    seosamh77
    Free Member

    Check my edited comment; you need to keep an eye on that and other such possibilities, zeros and o’s for example.
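
    One way to make that kind of manual check less painful is to mark every character that belongs to a commonly confused group, so the eye goes straight to the risky spots. This is only a sketch: the groups below are an assumed, non-exhaustive list (the l/I and 0/O pairs come from this thread, the rest are guesses), and the helper names are made up for illustration.

```python
# Glyph groups that OCR often mixes up; an assumed list, not an exhaustive one.
CONFUSABLE_GROUPS = ["0Oo", "1lI", "5S", "2Z", "8B"]
CONFUSABLE = set("".join(CONFUSABLE_GROUPS))


def flag_confusables(line: str) -> str:
    """Return a marker line with '^' under every character worth re-checking."""
    return "".join("^" if ch in CONFUSABLE else " " for ch in line)


def count_confusables(text: str) -> int:
    """Count how many characters in the text belong to a confusable group."""
    return sum(ch in CONFUSABLE for ch in text)


if __name__ == "__main__":
    # One line of OCR output, copied from a post in this thread.
    sample = "WlebvdiCqsOoabK3jEf dGExce63zDI0="
    print(sample)
    print(flag_confusables(sample))
    print(count_confusables(sample), "characters to double-check against the image")
```

    Printing the marker line directly under the text only lines up in a monospaced font.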

    Cougar
    Full Member

    Aye, ta. It doesn’t decode into anything sensible so that might be the case.

    seosamh77
    Free Member

    no worries, good luck!

    Cougar
    Full Member

    Just an update to this if anyone cares.

    Someone else typed it in manually. I compared it electronically with Joe’s OCR effort and highlighted differences, which made it easy to check mistakes. I believe I now have an accurate copy – if it’s wrong, then we’ve both got it wrong in an identical manner.

    Thanks again. (Now I just need to try and decode the bugger.)
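
    For anyone wanting to repeat the electronic comparison, here is a minimal sketch using Python’s standard `difflib`. It is not necessarily how the comparison above was done; `diff_positions` is a hypothetical helper name, and the two sample strings are one line each from the corrected text and the Acrobat output posted in this thread.

```python
import difflib


def diff_positions(manual: str, ocr: str) -> list[tuple[int, str, str]]:
    """List (offset in manual, manual text, OCR text) for every span that differs."""
    matcher = difflib.SequenceMatcher(None, manual, ocr, autojunk=False)
    return [
        (i1, manual[i1:i2], ocr[j1:j2])
        for tag, i1, i2, j1, j2 in matcher.get_opcodes()
        if tag != "equal"  # keep only replaced, inserted, or deleted spans
    ]


if __name__ == "__main__":
    # Second line of the block: corrected version first, Acrobat OCR version second.
    corrected = "ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AA5+lqsG/ELBvRa"
    acrobat = "ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AAS+lqsG/ELBvRa"
    for offset, expected, found in diff_positions(corrected, acrobat):
        print(f"offset {offset}: {expected!r} vs {found!r}")
```

    An empty string on either side of a reported difference means a character was inserted or dropped, which is how stray spaces show up. Agreement between the two copies only shows they match each other, not that either matches the image.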

    seosamh77
    Free Member

    how many differences did that comparison pop up?

    seosamh77
    Free Member

    And another question just out of sheer curiosity, what would you expect that amount of code to decode to? Just the same amount of legible text, a lot more/less?

    I know if you get code like that for an image there’s always millions of it, so I wouldn’t expect there’s much actual info behind that code?

    Cougar
    Full Member

    how many differences did that comparison pop up?

    There was one mistake in the manual transcription (next to last line, can’t imagine why…!) and four I think in the OCR (not including the two rogue spaces I’d already spotted).

    And another question just out of sheer curiosity, what would you expect that amount of code to decode to? Just the same amount of legible text, a lot more/less?

    I thought it was Base64, which would return a block of data about three-quarters the size of the original. But it’d seem I’m barking up completely the wrong tree.

    The thing is a puzzle that’s been posted on a Facebook group (for “escape room enthusiasts”) with no preamble or context other than “what do you think?” It could be a bloody Magic Eye picture for all I know.
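
    On the size question: Base64 represents every 3 bytes as 4 characters, so decoding gives about three-quarters as many bytes as there are characters. A small sketch to check this on any line; `decoded_length` is a hypothetical helper name, and the sample is the first line of the block posted in this thread.

```python
import base64


def decoded_length(b64_text: str) -> int:
    """Number of bytes produced by decoding a Base64 string (whitespace ignored)."""
    compact = "".join(b64_text.split())
    # validate=True makes stray characters raise an error instead of being skipped.
    return len(base64.b64decode(compact, validate=True))


if __name__ == "__main__":
    line = "qANQR1DBwU4D/TLT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy"
    print(len(line), "characters decode to", decoded_length(line), "bytes")
```

    A clean decode says nothing about whether the bytes are readable text. If the content is compressed or encrypted, the decoded bytes will look like gibberish even when every character was transcribed correctly.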

    Cougar
    Full Member

    Incidentally, here’s the corrected output, if anyone is feeling inspired.

    qANQR1DBwU4D/TLT68XXuiUQCADfj2o4b4aFYBcWumA7hR1Wvz9rbv2BR6WbEUsy
    ZBIEFtjyqCd96qF38sp9IQiJIKlNaZfx2GLRWikPZwchUXxB+AA5+lqsG/ELBvRa
    c9XefaYpbbAZ6z6LkOQ+eE0XASe7aEEPfdxvZZT37dVyiyxuBBRYNLN8Bphdr2zv
    z/9Ak4/OLnLiJRk05/2UNE5Z0a+3lcvITMmfGajvRhkXqocavPOKiin3hv7+Vx88
    uLLem2/fQHZhGcQvkqZVqXx8SmNw5gzuvwjV1WHj9muDGBY0MkjiZIRI7azWnoU9
    3KCnmpR60VO4rDRAS5uGl9fioSvze+q8XqxubaNsgdKkoD+tB/4u4c4tznLfw1L2
    YBS+dzFDw5desMFSo7JkecAS4NB9jAu9K+f7PTAsesCBNETDd49BTOFFTWWavAfE
    gLYcPrcn4s3EriUgvL3OzPR4P1chNu6sa3ZJkTBbriDoA3VpnqG3hxqfNyOlqAka
    mJJuQ53Ob9ThaFH8YcE/VqUFdw+bQtrAJ6NpjIxi/x0FfOInhC/bBw7pDLXBFNaX
    HdlLQRPQdrmnWskKznOSarxq4GjpRTQo4hpCRJJ5aU7tZO9HPTZXFG6iRIT0wa47
    AR5nvkEKoIAjW5HaDKiJriuWLdtN4OXecWvxFsjR32ebz76U8aLpAK87GZEyTzBx
    dV+lH0hwyT/y1cZQ/E5USePP4oKWF4uqquPee1OPeFMBo4CvuGyhZXD/18Ft/53Y
    WIebvdiCqsOoabK3jEfdGExce63zDI0=
    =MpRf
    seosamh77
    Free Member

    Cool, 4 plus spaces is not bad at all, better than I usually get from any text I’ve needed to OCR, good stuff.

    Sounds interesting, but I’m absolutely no use to you there, I know nothing about code. I did one of those escape room things once, was alright, I found it quite hard though. I unwittingly cheated my way out of the cells they locked us in through brute force. I thought that was how we were meant to get out; a woman came in absolutely puzzled: how’d you get out of there? 😆 I asked her if that was the fastest anyone had ever got out of the cells!

    Cougar
    Full Member

    As I said, I’ve no idea whether it’s actually code, it just looked like it.

    I’m slightly addicted to escape rooms, I did my 13th one on Friday with my minions apprentices from work. They did well.

    I’ve got an opposite “jail-break” tale. I did one where you start all handcuffed together. You’re supposed to get the key fairly early on in the proceedings but we missed something obvious so completed the entire room chained up to each other! Worse, the GM was inattentive and didn’t notice we’d got out, so we were wandering around the corridors and reception still chained together…

    seosamh77
    Free Member

    😆

    I was wondering, because it’s been posted as an image, whether that might suggest it isn’t meant to be decoded digitally, so possibly there’s some visual decoding you can do on it to find a message? Nothing pops out though, I don’t really see any obvious patterns.

    Cougar
    Full Member

    I’d agree. But I couldn’t see anything of use either, which is why I resorted to this tactic.

    Oh, how did you ‘brute force’ the room, incidentally?

    fifeandy
    Free Member

    5 mins with google will tell you A) what it is, and B) what it says.
    This may or may not be cheating.

    Hint: it’s encrypted

    Cougar
    Full Member

    Cheers.

    Like I said earlier, it looks like Base64 encoding to me. But it decodes to gibberish so it presumably must be something else.

    seosamh77
    Free Member

    Oh, how did you ‘brute force’ the room, incidentally?

    They led us into the room blindfolded and put us into 3 cells. I was looking about everywhere to get out, was actually just going to climb out at first, but I thought it unlikely they’d expect people to do that. Next I was looking at the electricity box high up, as I saw the cells were magnetised (just as well I didn’t start on that, I nearly did!), but then I spotted this half-sawn plumbing pipe on the wall. I looked through it and saw a button at the other end; the rawl plugs on the wall holding the thing in were a bit wonky so I assumed you just had to shove this thing and press the button. Bingo, all 3 cells opened, woman came running in! 😆

    Apparently we were meant to do some sort of trick with 3 ropes and a basket and get a pole or something from somewhere, never quite got it exactly, but it involved all of us working together. Overall it was alright though, good laugh; it was only a small part of the game, so it never spoiled it. We eventually got out with a few minutes to spare. Was close though, if I hadn’t done what I did I doubt we’d have made it in time.

    Cougar
    Full Member

    Hah! I’ve done that room, I know exactly what you’re describing. I was actually in the same cell as you when we played.

    seosamh77
    Free Member

    I never even registered the basket and ropes!

    Cougar
    Full Member

    😆

    Which venue was it at? They’ve got a few locations.

    seosamh77
    Free Member

    In Glasgow, just behind Paisley Road West, in some auld industrial unit. Think they’ve got other rooms in the city centre as well.

    I don’t think they anticipated my kinda lateral thinking! 😆

    leffeboy
    Full Member

    That sounds more fun than the Base64 thing :), need to see if we have any here

    seosamh77
    Free Member

    It’s worth a go aye, don’t think I’d go for drac level 13 times mind you 😆 , but aye, I’d try another room.

    IvanDobski
    Free Member

    Could it be base64 but with the final 4 characters acting as some kind of key/offset?
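
    One possibility, which nobody in this thread has confirmed: the layout (64-character lines, a shorter last line, then `=` and four characters) resembles OpenPGP ASCII armor, where that final line is a CRC-24 checksum of the decoded data rather than a key or offset. If that guess is right, the checksum also gives a way to verify a transcription. The sketch below follows the CRC-24 described in RFC 4880; `armor_checksum` and `checksum_matches` are hypothetical helper names.

```python
import base64

CRC24_INIT = 0xB704CE
CRC24_POLY = 0x1864CFB


def crc24(data: bytes) -> int:
    """CRC-24 as described for OpenPGP ASCII armor in RFC 4880."""
    crc = CRC24_INIT
    for byte in data:
        crc ^= byte << 16
        for _ in range(8):
            crc <<= 1
            if crc & 0x1000000:
                crc ^= CRC24_POLY
    return crc & 0xFFFFFF


def armor_checksum(data: bytes) -> str:
    """Format the CRC-24 as an armor trailer: '=' followed by 4 Base64 characters."""
    return "=" + base64.b64encode(crc24(data).to_bytes(3, "big")).decode("ascii")


def checksum_matches(body_lines: list[str], trailer: str) -> bool:
    """Decode the Base64 body and compare its checksum with the trailer line."""
    payload = base64.b64decode("".join(line.strip() for line in body_lines))
    return armor_checksum(payload) == trailer.strip()


if __name__ == "__main__":
    # Self-check on made-up data; swap in the transcribed lines and the "=MpRf" trailer.
    demo = b"example payload"
    body = [base64.b64encode(demo).decode("ascii")]
    print(checksum_matches(body, armor_checksum(demo)))
```

    If the armor guess holds, a match on the real lines would strongly suggest the transcription is exact, and a mismatch would mean at least one character is still wrong. If the guess is wrong, the result means nothing either way.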

