Friday, December 18, 2020

Stata Tip: How to Clean String Variables

I know, I know. It's been a while since you've heard from me. You know how some people have been super productive during the pandemic? I haven't been one of them. Oh well. The good news is that I survived teaching my first all-online semester. I submitted grades yesterday! To all of the students who attended the classes live this semester, and especially to those of you who put your camera on every now and then, thank you! To those of you who couldn't, do stop by my office sometime when we're back on campus to say hello. ;)

And now, to get things restarted on this blog, I thought we'd go with a Stata tip on how to clean string variables. Sure, sometimes we are given nice, ready-to-go data. Other times, not so much. For example, imagine that "Delia Furtado" sometimes appears as "Delia furtado" or "delia   furtado" or "Delia%&Furtado". What to do? See Verena Wiedemann's Stata tricks, posted on Oxford's Coders' Corner (h/t David McKenzie's blog).

Yes, you could try to find these "by hand," but the more you do automatically, the less likely you are to make mistakes!
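
To make that a bit more concrete, here is a minimal sketch of what an automatic cleanup could look like for the name example above. To be clear, this is my own illustration and not taken from Verena's post. The variable names (name, name_clean) and the toy data are made up, and I'm assuming Stata 14 or later, since ustrregexra() is one of the Unicode string functions.

```stata
* Toy data: "name" is a hypothetical variable holding the messy strings
clear
input str20 name
"Delia Furtado"
"Delia furtado"
"delia   furtado"
"Delia%&Furtado"
end

* Replace every run of non-letter characters (blanks, %, &, ...) with one space
generate name_clean = ustrregexra(name, "[^A-Za-z]+", " ")

* Drop leading/trailing blanks and collapse any repeated internal blanks
replace name_clean = stritrim(strtrim(name_clean))

* Capitalize the first letter of each word and lowercase the rest
replace name_clean = strproper(name_clean)

* All four rows should now show up as a single value
tabulate name_clean
```

One caveat: the regular expression here treats anything that isn't an unaccented letter as junk, so it would also strip hyphens, apostrophes, and accented characters (think "O'Brien" or "José"). If your data has names like that, you'd want to loosen the pattern before running it on everything.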