The Online Genealogist

John Brugliera

Archive for the tag “OCR”

Got Brooklyn ancestors? a.k.a. Wanted: Eagle-eyed genealogists


If you have any Brooklyn residents in your family history, you’ll surely want to check out this valuable resource… the Brooklyn Daily Eagle.

With enormous thanks to the Brooklyn Public Library, we have free online access to much of the Brooklyn Daily Eagle from 1841 to 1955.  Fully searchable and browsable, this newspaper is chock full of great information about Brooklyn and her colorful residents throughout the years during New York City’s largest period of growth.

While the Eagle has the usual stuff – obituaries, marriage announcements, probate and other legal notices – there are extra goodies to be found within this cool little neighborhood paper.

How about sale of property due to unpaid taxes?


These listings go on for PAGES!  Check out any Wednesday Eagle in 1860.  If it’s more than four pages, then you’ve got several MORE pages of Brooklyn properties, with owners and locations.  Look at all the names here!!


And this is only ONE PAGE out of EIGHT!

In looking for your Brooklynite, start off with the OCR search.  Just don’t rely on it fully here, especially with all of the dots and numbers to confuse that iffy software.


So, who’s got an Alden S. Crowell or Hugh G. Crosine in their line?

Also, in the Things You Wouldn’t Usually Expect in a Newspaper category, the BDE sporadically published a…


This column for idle Brooklyn mail is broken down into a Ladies’ List and a Gentlemen’s List.  This is particularly noteworthy as there are so few resources in the mid-1800s that actually list HUNDREDS of women’s and wives’ names.  Look at those ladies’ names going all the way down to the center of the page!  Genealogy GOLD.


Being on this list would mean you’ve moved, you’re in jail, you’ve gone underground, you’re dead or you’re just plain LAZY.  No matter, it’s proof of residence – at one time or another.  Well, if that actually IS your Della Hall listed above there…

I barely scratched the surface in this post, but can confidently say that the Brooklyn Daily Eagle is an excellent neighborhood newspaper which should be at the TOP of your research list for all things Brooklyn.

Unfortunately, I don’t have any Brooklyn family.  But that doesn’t mean I can’t enjoy electronically flipping through the pages of the Eagle, with over 100 years to choose from!

Whether it be in Brooklyn, NY or Brookline, MA,  I will research your family history – I’m The Online Genealogist!



How many genealogy pay sites does one really need to subscribe to?


I wonder if any fellow researcher has determined how much it would cost per year to subscribe to ALL of the major annual-payment genealogy websites.  What do you think that dollar amount would tally up to?

Just off the top of my head, I’d say $1,000 would be a good ballpark figure.  Of course, only the (wealthy!) genealogist who needs access to EVERYTHING would ever dole out that $1k each year.  So, we can probably agree that subscribing to ALL of them is not necessary.

So, how many and which ones should you cough up the dough for?


Not to sound all wishy-washy, but it depends on YOU and what kind of family research you are doing.  Some of us are perfectly content in sticking with the multitude of free websites available, but others (such as myself) do realize that the information we glean from the pay sites is WELL worth the cost of admission.

Let’s take me, for example.  To me, subscribing to Ancestry is a no-brainer.  And I’m just talking the U.S. Discovery package here.  I was a World Explorer once, but with going pro (again), I really only needed the States stuff – which is sufficient.  The breadth and scope of Ancestry’s domestic offerings are just what the doctor ordered for researching successfully for myself as well as others.

When it comes to military records, Fold3 is tops in my book.  They’ve got everything from enlistment records to actual pension FILES, and everything in between!  And now that Fold3 is under the Ancestry umbrella, military searches on Ancestry may bring up results linking directly to Fold3.  Pretty slick, I say!


As my research specialty is New England, NEHGS’s American Ancestors was another must-have.  They’ve got the Barbour Collection (CT vitals), The Great Migration Begins 1620-1633 (earliest immigrants) and their NEHGS Register, with Volume One dating all the way back to 1847.  Yes, there was genealogy back then.  The major selling point for me, though, was the ability to access Deaths Reported in the Boston Recorder and Telegraph, 1827 & 1828!  <–Joke.  And a bad one at that.

Is that it?  Of course not!  Just today, I decided to sign up for WorldVitalRecords and GenealogyBank.  Both offer trial periods (free and not), and I’ve had them on my to-check-out list for a few months now.



Why these two?  Well, WVR because of their world vital records (duh) and Everton’s Genealogical Helper, an old favorite that I just enjoy flipping through.  For you young folk, it was THE genealogy magazine, before this whole crazy interweb thing.  Yes, a magazine.  Kind of like a book, but more flexible and chrono-relevant. 

GenealogyBank has newspapers, newspapers and MORE newspapers.  But again, this was after finding that it had the best selection of New England newspapers, compared to all the other guys and their newspaper-prefixed dot-coms.  GB also appears to have top-of-the-line OCR (Occasionally Correct Reader) software.  I was very impressed with a few of the items found, given the original papers’ condition and film quality.

Whenever I’m checking out any potential pay site, the very first thing I do is enter my name in the “free search” box.  No first name; only the last.  It’s uncommon enough so that I can tell what they have by what results come up for it.  You may want to try one of your obscure family names to get the same idea.

Speaking of free searches, I use Mocavo only for the search results and then find the links on my own.  As it’s a Google for genealogy, most things can be found easily enough once you know what they are.  If that makes any sense.  Sorry, Michael.

Bottom line: try before you buy.  LOOK at what records they actually HAVE, which I know can be difficult to do with some of them at times.  Do a few of those oddball searches, and if you go “Oooooooooooooooooooooo!” upon seeing the results, you’ve got your answer (heh).

And going back to the aforementioned Genealogical Helper, here’s a page from 25 years ago!


Pretty scary, eh?


And if YOU’RE looking “for a CHEAP estimate”, contact ME… The Online Genealogist!!!  Replace “Brockton, MA” with “West Lebanon, NH” and “Southeastern Massachusetts” with “New England, New York and Eastern Townships (early Quebec)”, and we’re there!


Nothin’ like a few million FREE IMAGES to spice up your family story!

Flickr - IA

The Internet Archive folks recently posted over 2.5 million IMAGES onto the photo-sharing website, Flickr.  They were extracted from thousands of books that were originally searchable by text ONLY at Internet Archive.  Now the images can be searched on Flickr!

Why is this a big deal for genealogists?  We get perty pictures to go with our family histories!  Even though they are mostly “old” images past copyright, you’ll  surely discover a visual gem or two to accompany your ancestors’ stories.

Whether it be something specific like a photo of a long-gone family homestead or generic such as a period steam liner used to illustrate an immigrant family’s trans-Atlantic journey – it’s probably in there.  Remember, we’re talking over two-and-a-half million images here!

So, if you had MacLarens in Windsor, Ontario around 1900, they may have been “manufacturing” cheese…


Or perhaps some of your family lived near Chicago’s Garfield Park.  Here’s a close-up of that area from 1921.  There are several other neighborhoods available for viewing/downloading!


Maybe you’re the 3rd-great-grandchild of Dr. P. Edward Seguin, who set up practice in Royalton, Minnesota.  Do a Flickr search for him now and his photo comes right up!  He’s the one with the facial hair (heh).

Seguin 01

Then you hit a link and the original book is shown in its entirety; you’ll see the image in context and maybe find a few more words to go with your Man of 10,000 Lakes.

Seguin 02

Nice stash there, guy.  Oh, and check out his goateed colleague, George Allen Love, M.D. — Dr. Love!  (And yes, I love stuff “finding me” like this.)  Time to break out some Kiss!…

And while most early records aren’t OCR-friendly, they are definitely considered to be images.  Such as the below Allen County, Indiana Circuit Court Index from 1824.  (Hi ACPL!)  All images are downloadable, with Flickr’s excellent choices ranging from thumbnail to original.  I always grab the original, then re-size that as needed.

Allen 02

You can also download several stock photo-type items without the worry of being busted by the copyright police!  Like this large uppercase “C” for the background of your Carlson Family homepage.


Anyway, you get the idea: do NOT overlook this incredible Flickr/Internet Archive e-collection while gathering all sorts of images for your family story.

Then there’s this one image we will ALL use when we finish our family histories and they’re complete.

Adam & Eve


Oh, and PhotoShop, etc. can also straighten images to make them even PERTIER!


And if YOU think that your ancestry can be traced all the way back to Adam & Eve, do NOT hire me — the Online Genealogist!!



It’s a Mocavo Two-fer!


First, big news that Mocavo has been purchased by FindMyPast.

Here is the full announcement on their home page.

I’m almost ashamed to say that I’m not subscribed to Mocavo or FindMyPast.  Almost.  I mean seriously, how many of these paid-subscription websites must a genealogist cough up the bucks for?  And with some of them, you do some serious coughing!  But that’s for another post…

Funny that those two were next on my “genealogy sites to subscribe to” list, but only if I really really really really needed to.  As Mocavo is more of a search engine, I’ve been able to locate the information on my own, after their Free Forever search comes back with the results.  Same for the newspapers on FindMyPast – many are already available online for free.

So, what does this marriage mean for us?  A bad name like FindMo’Cavo??

Well, to start, a combined website/yearly subscription would be nice!  *COUGH COUGH*  (The till is dry.)

I’m sure FindMyPast and Mocavo joined forces for the very reason I’ve yet to subscribe to either: they really don’t have enough exclusive material to warrant the extra expense.  It’s almost like they’re trying to snatch up the “scraps” that Ancestry and FamilySearch (and to an extent, Fold3) don’t want.

FMP/M will have plenty to say in these coming months.  But will their combined efforts be enough to get me off the fence?

And did you know Mocavo will scan your genealogy-related books, diaries, photos, etc. for FREE?  (Love that word.  FREEEEEE.)

Simply click that Contribute button on their navigation bar.  (Or you could always just click the clickable “Contribute” I made right there.)

I recently found two local town landowners’ annual reports at a church rummage sale and mailed them to Mocavo for them to scan.  Upon doing so, they’ll add these to their collection of OCR searchable items for ALL!  It’s a GREAT service and I’m hoping that many subscribers (or not) will take them up on this offer.  So, keep an eye out, as there’s lots of genealogical stuff out there for scanning!

One very nice Mocavo niche is the central availability of such annual town reports, many of which contain births, marriages and deaths recorded during that past year.  (Obviously, better chances of seeing those for smaller towns.  Cities will simply give you the grand totals.)  Though again, with some digging, you can find most of these annuals online elsewhere… yes, for free.

But you probably don’t want to send Mocavo anything that’s near and dear to your heart.  Especially books, as they say they need to remove the binding for better scanning, which makes perfect sense.  Read the fine print.

I’ll let you know when “my” town records come up online there.  (Supposedly, they’ll contact me AND give me credit for the data.)

So, let this be an open challenge to FindMyPast/Mocavo:

Knock me off the fence!!!



OCR = Occasionally Correct Reader

When you search within scanned newspapers or books online, the results are usually derived by optical character recognition software, more familiarly known as OCR.  The idea is for the software to “read” the printed characters and “translate” them into readable/searchable words and sentences.  Unfortunately, that simple-sounding task isn’t so simple given the inconsistent quality of the printed and scanned material – especially when it comes to older newspapers.

Here’s more info than you’ll ever need on OCR…

And a great example of how newspaper print can “confuse” the software…


Yes, very iffy and not fully reliable.  “At” comes up as “la”, “the” is “che” and “good” here is read as “gobd”.  But then it actually reads “lovers”, which looks more like “lpvera” in the copy.  As you can see, OCR results can be extremely sporadic.
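For the curious, here is one rough way to put a number on how far off each of those misreads is.  This is only a sketch of my own, not anything the newspaper sites are known to use: it counts the single-character edits (a standard measure called Levenshtein distance) needed to turn the printed word into what the OCR produced.  The function name `edit_distance` is made up for this illustration.

```python
def edit_distance(printed: str, ocr: str) -> int:
    """Levenshtein distance: the fewest single-character insertions,
    deletions or substitutions needed to turn `printed` into `ocr`."""
    # prev[j] holds the distance between the first i-1 characters of
    # `printed` and the first j characters of `ocr`.
    prev = list(range(len(ocr) + 1))
    for i, p_char in enumerate(printed, start=1):
        curr = [i]  # turning i characters into an empty string takes i deletions
        for j, o_char in enumerate(ocr, start=1):
            cost = 0 if p_char == o_char else 1
            curr.append(min(
                prev[j] + 1,         # delete a character from `printed`
                curr[j - 1] + 1,     # insert a character from `ocr`
                prev[j - 1] + cost,  # substitute (or keep, if they match)
            ))
        prev = curr
    return prev[-1]


# The misreads quoted above, as (printed word, OCR reading) pairs.
pairs = [("At", "la"), ("the", "che"), ("good", "gobd"), ("lovers", "lovers")]
for printed, ocr in pairs:
    print(f"{printed!r} read as {ocr!r}: {edit_distance(printed, ocr)} edit(s)")
```

A single wrong letter is all it takes to make an exact-match search miss a word, which is why “gobd” never turns up when you search for “good”.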

When words are hyphenated, OCR gets really confused, which can be a source of “amuse*ment” for some, but frustration for most.
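If you have copied OCR text out to clean it up yourself, a few lines of Python can stitch those split words back together.  This is a minimal sketch under my own assumptions (the helper name `rejoin_hyphenated` is hypothetical, and it assumes the hyphen survived the OCR as a plain “-” at the end of a line):

```python
import re


def rejoin_hyphenated(text: str) -> str:
    """Join words that were split with a hyphen at a line break,
    e.g. 'amuse-' + newline + 'ment' becomes 'amusement'."""
    # A word character, a hyphen, optional spaces/tabs, a line break,
    # optional spaces/tabs, then another word character.
    return re.sub(r"(\w)-[ \t]*\n[ \t]*(\w)", r"\1\2", text)


column = "variety and will surely please all lovers of amuse-\nment."
print(rejoin_hyphenated(column))
```

One caveat: this also glues together words that are SUPPOSED to keep their hyphen when they happen to break across lines, so “eagle-” and “eyed” would come out as “eagleeyed”.  Proofread the results.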

How well do you think OCR can read this obituary page from The Sun (New York City)?


Maybe it will properly capture that bold “Mary” about halfway down, but the rest of it?  Well, let’s see, shall we?

There are several free OCR readers online; just upload your image and see it magically transformed into perfectly readable text!

Uhhhhhhhhhhhhhh, maybe not…

||;1. – I’u.z l.. ;\..-»-. f.;,,| , ;»,
“”~ ‘1 “8 K ‘–‘ 1. 3. “‘1 -ll IP12
’14‘ :|..n !~»_’: 4. ‘-. – ‘ .‘ . ‘H “°u~
“I 1;? N_lv»\TNS0¢ Y‘ … in ~-‘ bl.‘ 4 u-I Tm ,’-I :1 , J01; I’, pl 1′ P,
-» “‘|W.‘- 9’ P‘ ‘1 H” – ~ ‘~ u “I, or \‘4I|-0“ 0 , 1I“\
“””‘*”‘ ” M – . W ‘  1. 4) -‘,‘-x~-A-.. 0’\O’ luau
V‘ ‘6-‘ “. ‘I “W I ” 5 \o.:”‘ ‘film llgl. ‘ ll ‘1 l|\ .1 |f0ug;¢
I’M.‘ -0.”. L‘ -0″‘ , 44 Q uflh-,-fp 4|. ‘nu ‘*0 1| \..;§’ ‘ .’f.».
: rt E ‘u“ “UV 01.0”] ’00 1.1‘   ‘In fl,§|| “||‘h’n l|.0’Jf‘¢
|_|’ 1′ K —\”\ l’r ol’\\’ 1.!” , “l.| ‘ IN»-‘I| H;-H K ¢|v|||,g||.
“r;.\-V | , » .!l . ‘ ‘\, I‘, I o_ .‘ .  .\.o,‘f-,.0 |~‘u“lf,“‘_“|‘
‘ 0|-I110,‘ ||”\’;-0. ‘ad Yf,v9¢I’ .|’ O7″. “9u”.‘v. “.n |…\ ‘um.
u ‘K’.’ ‘ ‘1? -‘ x f‘ .  ” *‘?”‘ ‘\ “ -‘“ ‘ *–‘””*|
. ..“‘||’H\‘ .\\~ |¢0,,‘¢‘“’ ” L.__ ‘Q f qg.‘ u|”.o,,| ‘“|’~ P”… ..
‘ Tm. ‘l.’::r |-1 ‘§l,.‘w” WW” ““‘ ‘ “””°”-“lam N~’ml¢!-
“‘|”l-;~:;§\‘- _’_i1;! I.» H. rm |>-1-\.g.y,’.MI|. O, ION, Am“;
Th H1: rn \;= :1 V? “”1 H vw ‘ ~w.v’
‘ \ .. , ‘ ‘ . ‘ ‘ .0 0 \ O
M ‘xi H) W W,“ Jar’ _4‘. _: 1:. -(‘$10! rntchluc NI
x  \’\‘.‘l.,’:’c t | ‘H1 \ .‘.\. _.l_“‘_‘  .”0“”|.Q “0”
|:\tn|’|:..1vh~’\.uh” “T. .3 ‘hnurw M‘. “”, L‘
‘ – – ~ “ll” — 1 ad uln-
‘ f~’.~*f’||.\\.:}_“~-4 !.!”»’I»~w: ‘.~’ _H’: l»\uvu_\ .m- rm-|m~Iln;|y
-b ’ I-I n’t‘H\‘| 0″. “U ‘9‘! ‘8. ‘“w\ ‘mM..m..”‘ M.’
:»|r~”;’|:~-Q-|’;v.’.|n|: ‘1 LVN” .l:’\l. I ‘H’. N“ “m“m““”
s é 1 ‘, ‘ ‘
:‘|.\H’;:|\ – “ll Ihu”°»H\ -‘1” 1.1%‘, Huh!“ “.§|H\Ih
Q :~’:r:rI -u|)\1u\t‘I~bl1. \ –mm luau. Inland. llvd ll
| “u”_””fff!;” I_’=’l {‘4 I1-if fir» rmvct!ull{ lam.-Mo
bu?‘ “ f 1-‘.|.a~._§_‘._. Rf!“.I§:-“rr‘1al’°fl\°o-‘:0! uh r-mmlg,
; “NJ” M.‘ .’ ” ‘; Q. IMP |Hu.~ “0.-I LT: r.!I;.t’|’l.:a.o.:l.|.f¢?~.|‘Us’:
y ‘\’\”:”3 a’~u7″‘|’ “”£1r1′;”‘””‘»’::“;~?’;”:|”‘“:”;.”§’
‘ A 0. Q ‘I |-‘Q . * \ .  5“ .  . ~. .
‘ ‘MS: ; ‘rE._’¢“. -‘| HI‘o’r|’.-Hf.‘-“1′.\.m bu c”
i \’“|’|»” \\|\ In; \..;ln’ IA! I-n I nfln l\-nL_-l..-

Wow, it didn’t even pick up “Mary” as I thought.  The only word I see is “Inland”, which is supposed to be “Ireland”.  (Funny that WordPress itself translates some of the text into smileys!)

Let’s try another image.  It’s darker and crisper, so the results should be better than the above.


I highlighted two couples — the thinking is that the top pair will be read better than the bottom.

IA It III”. –
IIIQWK-onllllzl-l..-In , Wm; 9
no-I».l at st. A|}n’n on the |\’.-an-. LV”.I\e- ll»-v..’8?nl‘
mu firhnurl. I.II.. rmvfin OHQM Iv-own us-In it
£_li:’:0:’ou“ .¢::’,:nvt 0! JOUII I. lid NM D10 null .
H ‘ , , I ‘ .
,,.’*.3″.::*::’>:;.i”:.’;’¢..:’:.‘!”;’-.» n~.r.”.”-:a’t:-
IL I ‘ 0 ‘ ‘ ~ O ‘ I Q
no-1|-~r|» .\. I Mm! M Um 1-an to Marv Jo‘. cl mgmur of
{Iv I:-tr J. lhnord 510001001! Nllllvlllj‘. Him.
. ‘ ‘~‘.“’.‘3’»“ Y I i|‘»’2i=§’2‘-“i’8#’; ‘6’?» °’.~2!2..%‘.’.‘€.‘
 0″ |, ‘ N‘ :\’ – ‘ . P 0 0 ‘ O
‘ 1′ I.a‘tt ‘oi  1;. 5′ ~’.‘\u‘3illI B-. dllfllw-I’ 0| U00 UM
‘ ,1 ~‘ ° | ‘ U 0 I. .
| “1§|||’l‘|\’.-s – iwrnnvnn -In Joohonvllln. Hm. ol
.\Iua0n‘\’. Jan. M ny W. N. l|\\r3m%Ju0ucc- e.l we Puke.
–.’.» :;~.’ £ :’.’2″‘<:.’=”‘~.: m-2:. ~ “”‘.’;:-“‘.:~.,……
‘ § . ‘“ 0 “
Jan. :1, II |;mlwr|w|Pl’n|-K. ilk; I§tbe IlrV._ L‘:
llamlltmml. “’o|l¢’I’ Imug|;m of ‘WI ufk In (.lI’fI”‘
daugmer of ca» nu» Iv. s nude Inn or llw II-c
-I .
ul’ll.l. J \f‘IlSn\i (m Tm-ndny. P0701, by Oh! RN‘
mu. rm»… |~. 1» , \\m um to Mu-e, ammo: oi
w(“|’vi’l1′?‘*:i’v”‘ M!|””‘ ‘i’r’|”|\r,<‘»?» “.1950 Wvdnflbl
, , \ . . ,
r\’mln~’. I» » n. I. an nw nrmlnuci umhv lm¢IQ’0 (vii;
Mr \ Ilmulu” Hhmmlnl annmgn Flvrllqn. ‘ ..l0
Mn» .\~lq|~: l.. lfznm, wlnnlfierof Ir. Wm. la. ll nu.

But then again, maybe not.  No wonder this OCR is FREE.  (Yes, some OCR is better than others.)

If it can’t read THIS image, then this OCR is basically USELESS…


With regard to online newspaper image quality, this is about as good as it gets.  The OCR results…

luflsnl-BEBKIRBAIYOII.–On T0009“. his 15
at Cmlla Church. b tho llwv. Ila I Iv . hal-
Iof. htolnll ;. Ill to OIIMI cl! 00.3″: gt!!!»
W I –
Damn Print“! 1‘ umuiwlnl .l..
vi .na~ A RIB.-On “mum. Jun 10. at we
“nee of gm bride‘ rum. by uvkv. Mr. ‘nu,
r. lllmn oven 03 gflllllllkg. 1% Mm mu
“DER, magma: 0 . . an. .. ol claim
LA IZY.—In the my ol Rex‘ York. Ml QM 101,?
Q I | 0 b’ 0 J
::::§*‘:.’%:.z.::'”,’.:a: .2’l..’.:., ..’:r:’u…”:’m.*.,’.
lIIU.—0| J I
€:%:c|’r:ac,nuct or gear) in
n mmnv. one 81.01 crazck I’. ll. lhlouvon 0|‘
Q an rrgccllulznot to uugmt.
…~.:<:>;.w.~-::…:.; “-2.1% .~.::~”- “–‘–
0 nl0tlvr‘_na¢: 1110010.” the $inu|’y. and than 0|
In non. Jun lhmou. on rnloootlnuy mange! co nt-
an In Marni Iron: Mr mo mudnoce. 810 cot IMO
at on goo-lay. Juno Ital, I’. I.
I10! \l0. 0 anmlay. Mm tn. Cnherlnt In-Ion!
1).‘: fncboluqnrnm. to the lmh {far M er cm.
R ouvn and rlentotuc um lumd I
0¢uod*no hum] Irom r Iota Fianna‘. III Mount
IL, II u1’ncQ‘. inn II. it 90′: net A .,0la0nn- 0|
1‘; .{“.’!’.’_‘! L_’!¢P;_¢”t’!’-“.9!.L’¢_9.t1.wLlMm~ rm

Wow, is that bad or WHAT?!?

Going back to the original example of the bad OCR, can it even read the RIGHT SIDE correctly??

‘nu ‘h:u?””?-.“’ ‘ ‘ uh I Family Theatre.ThU theatre wli open to-day withan
1 U48 ‘ . . . .
u “Mn ch…“ u H” n‘ I. ‘uh ent|re change of b|Il at Its mat|nee.The bull la of gobd
,¢g_ ‘pp 5‘ |i_ Q 3*] ygflgg “Q variety andwill surely please all lovers of amuse”ment.
V.” WNW 9’91 I” |F””@ 3.3.‘ One of che feature is’ MissMinerva Vano. \ h e queen
nun; ‘Om- at am fawn U’ lb .
‘mar’ ‘an.’ “t Q…‘ 0‘ “. ~‘.‘. of the handcuffs.We copy the follow|ng from
‘ml; we can the loltou-In hon
|’|’||Q flung.‘ |4»“¢;_ Rig Ilgqgg, The Evening Leader. New Haven,
|¢oan: .

Better, but still unusable.

The bottom line here is that OCR is NOT at the point where it’s fully or even partially reliable in most cases, ESPECIALLY when it comes to newspapers.  Should you ditch the OCR searches altogether?  No.  But just realize that if you don’t get the expected results, it does NOT mean that what you’re looking for is NOT within that book or newspaper.  You’ll just need to browse the pages on your own, using your OWN built-in OCR.  And as shown above, even THAT can be difficult given the varying quality of original printed matter.
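If you can get at the raw OCR text (by copying it out of a viewer, say), a “close enough” search will sometimes catch what an exact search misses.  Here is a small sketch using Python’s standard `difflib` module.  The helper name `fuzzy_find` and the 0.7 similarity cutoff are my own assumptions, not anything the newspaper sites are known to do, and the sample text is a snippet of the garbled output from the obituary page above.

```python
import difflib
import re


def fuzzy_find(target, ocr_text, cutoff=0.7):
    """Return words in ocr_text that look similar to target,
    even when they are not an exact match."""
    # Pull out runs of letters; everything else is treated as noise.
    words = re.findall(r"[A-Za-z]+", ocr_text)
    # Compare in lowercase, but remember the original spelling.
    by_lower = {w.lower(): w for w in words}
    hits = difflib.get_close_matches(
        target.lower(), list(by_lower), n=5, cutoff=cutoff
    )
    return [by_lower[h] for h in hits]


sample = "mm luau. Inland. llvd ll"
print("Ireland" in sample)            # exact search: False
print(fuzzy_find("Ireland", sample))  # close-match search finds 'Inland'
```

Lower the cutoff and you will catch more mangled words, but you will also wade through more false hits.  And none of this helps with pages as far gone as most of that obituary column, where nothing resembling the word survives.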


And if you’re looking for a human OCR, I’m your man!



