<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0"
	xmlns:content="http://purl.org/rss/1.0/modules/content/"
	xmlns:wfw="http://wellformedweb.org/CommentAPI/"
	xmlns:dc="http://purl.org/dc/elements/1.1/"
	xmlns:atom="http://www.w3.org/2005/Atom"
	xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
	xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
	>

<channel>
	<title>Analytic Economics - Blog</title>
	<atom:link href="http://analyticecon.com/blog/?feed=rss2" rel="self" type="application/rss+xml" />
	<link>http://analyticecon.com/blog</link>
	<description></description>
	<lastBuildDate>Fri, 30 Dec 2011 19:25:59 +0000</lastBuildDate>
	<language>en</language>
	<sy:updatePeriod>hourly</sy:updatePeriod>
	<sy:updateFrequency>1</sy:updateFrequency>
	<generator>http://wordpress.org/?v=3.2.1</generator>
		<item>
		<title>What to Look for in a Data Scientist</title>
		<link>http://analyticecon.com/blog/?p=5</link>
		<comments>http://analyticecon.com/blog/?p=5#comments</comments>
		<pubDate>Fri, 30 Dec 2011 19:24:26 +0000</pubDate>
		<dc:creator>Mike Kimel</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://analyticecon.com/blog/?p=5</guid>
		<description><![CDATA[I don&#8217;t know if I would characterize myself as a &#8220;data scientist.&#8221; That said, I have done a lot of analytics work in my career and have designed a few statistical algorithms which has made me, at least at times, a fellow-traveler to those who call themselves data scientists. As a result, I was quite interested in this post by Cathy O&#8217;Neil about what she does in her role as a data scientist. She lists four points. The first two &#8230; <a href="http://analyticecon.com/blog/?p=5">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>I don&#8217;t know if I would characterize myself as a &#8220;data scientist.&#8221;  That said, I have done a lot of analytics work in my career and have designed a few statistical algorithms, which has made me, at least at times, a fellow-traveler to those who call themselves data scientists.</p>
<p>As a result, I was quite interested in <a href="http://mathbabe.org/2011/12/26/a-good-data-scientist-is-hard-to-find/">this post by Cathy O&#8217;Neil</a> about what she does in her role as a data scientist.  She lists four points.  The first two come down to forecasting and finding ways to make data make sense to businesses.</p>
<p>The next two points are these:</p>
<blockquote><p>3. I measure. This is where the old-school statistics comes in, in deciding whether things are statistically significant and what is our confidence interval. It’s related to reporting as well, but it’s a separate task.</p>
<p>4. I help decide whether business ideas are quantitatively reasonable. Will there be enough data to answer this question? How long will we need to collect data to have a statistically significant answer to that? This is kind of like being a McKinsey consultant on data steroids.</p></blockquote>
<p>O&#8217;Neil goes on to note, correctly I think, that most data scientists don&#8217;t view 3 and 4 as part of their job.  Why?</p>
<blockquote><p> It is far less sexy to try to honestly find the confidence interval of a prediction than it is to model behavior. Data scientists are considered magical when they forecast behavior that was hitherto unknown, and they are considered total downers when they tell their CEO, hey there’s just not enough data to start that business you want to start, or hey this data is actually really fat-tailed and our confidence intervals suck.</p></blockquote>
<p>In closing she notes:</p>
<blockquote><p>How do you select for a good data scientist? Look for one that speaks clearly, directly, and emphasizes skepticism. Look for one that is ready to vent about how people trust models too much, and also someone who’s pushy enough to speak up at a meeting and be that annoying person who holds people back from drinking too much kool-aid.</p></blockquote>
<p>I think O&#8217;Neil is right.  But it&#8217;s been my observation that not all that many organizations really want someone who will do just that.  Standing up at a meeting and saying that the business model rests on poorly reasoned conclusions and will probably fail is often what gets you put on the chopping block during the next round of layoffs.  Especially if you were right.</p>
]]></content:encoded>
			<wfw:commentRss>http://analyticecon.com/blog/?feed=rss2&#038;p=5</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
		<item>
		<title>Benford&#8217;s Law &#8211; A Tool for Hunting for Some Types of Fraud</title>
		<link>http://analyticecon.com/blog/?p=1</link>
		<comments>http://analyticecon.com/blog/?p=1#comments</comments>
		<pubDate>Fri, 14 Oct 2011 18:11:19 +0000</pubDate>
		<dc:creator>Mike Kimel</dc:creator>
				<category><![CDATA[Uncategorized]]></category>

		<guid isPermaLink="false">http://analyticecon.com/blog/?p=1</guid>
		<description><![CDATA[To extract meaningful information from data, it is important to have a good feel for numbers. One of my favorite examples of a &#8220;feel for numbers&#8221; is &#8220;Benford&#8217;s Law.&#8221; Benford&#8217;s law says that if you have a large set of multiple digit numbers, the odds that the first digit of any particular number in that set is 1 is greater than the odds that the first digit in any particular number in that set is 2, which in turn is &#8230; <a href="http://analyticecon.com/blog/?p=1">Continue reading <span class="meta-nav">&#8594;</span></a>]]></description>
			<content:encoded><![CDATA[<p>To extract meaningful information from data, it is important to have a good feel for numbers. One of my favorite examples of a &#8220;feel for numbers&#8221; is &#8220;Benford&#8217;s Law.&#8221; Benford&#8217;s law says that if you have a large set of multi-digit numbers, the odds that the first digit of any particular number in that set is 1 are greater than the odds that it is 2, which in turn are greater than the odds that it is 3, and so forth.</p>
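<p>As a rough illustration of that ordering: the usual statement of Benford&#8217;s law, which this post does not spell out, gives the probability of a leading digit d as log10(1 + 1/d). The Python sketch below simply tabulates that formula. The function name <code>benford_expected</code> is made up for this example.</p>

```python
import math

def benford_expected(digit):
    """Benford probability that the leading digit equals `digit` (1 through 9)."""
    if digit not in range(1, 10):
        raise ValueError("leading digit must be an integer from 1 to 9")
    # Standard form of the law (an assumption here; the post does not state it)
    return math.log10(1 + 1 / digit)

# Tabulate the expected share of each leading digit
for d in range(1, 10):
    print(d, f"{benford_expected(d):.1%}")
```

<p>Under this formula the digit 1 leads about 30.1% of the time and the digit 9 only about 4.6% of the time.</p>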
<p>This relationship has been spotted in all sorts of data, including measures of population, mass of heavenly bodies, and death rates, to name a few. Note that this applies only if the numbers are not dimensionless. That is, larger numbers have to relate to more people, more massive objects, greater death rates, or whatever it is that is being measured. Note also that Benford&#8217;s law doesn&#8217;t depend on scale &#8211; when measuring distances, it shows up whether the distances are measured in inches, yards, miles, or metric units.</p>
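<p>The point about scale can be checked with a small experiment. The sketch below is only an illustration under stated assumptions: the &#8220;distances&#8221; are a made-up geometric sequence (not real measurements), chosen because it spans many orders of magnitude, and the helper names are hypothetical. It converts the same values from miles to inches and kilometers and compares the first-digit shares.</p>

```python
from collections import Counter

def leading_digit(x):
    """First significant digit of a positive number."""
    # Scientific notation puts the first significant digit at position 0
    return int(f"{x:.15e}"[0])

def first_digit_shares(values):
    """Share of values whose leading digit is 1, 2, ..., 9."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    return [counts[d] / total for d in range(1, 10)]

# Made-up "distances" in miles (illustrative data, not measurements)
miles = [1.07 ** n for n in range(1, 2001)]
inches = [m * 63360 for m in miles]        # 63,360 inches per mile
kilometers = [m * 1.609344 for m in miles]  # 1.609344 km per mile

for label, data in [("miles", miles), ("inches", inches), ("km", kilometers)]:
    shares = first_digit_shares(data)
    print(label, [round(s, 3) for s in shares])
```

<p>For data of this kind the three rows should come out nearly identical, with the digit 1 leading roughly 30% of the time in each unit.</p>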
<p>Why is this interesting? Well, deviations from Benford&#8217;s law usually point to artificial constraints being placed on outcomes. For example, consumer psychology, in the form of a weakness for prices that end in &#8220;99,&#8221; can throw the distribution of observed prices off Benford&#8217;s law. But an equally likely artificial constraint is fraud. Deviations from Benford&#8217;s Law have been used as pointers to election or accounting fraud. In one recent study I liked a lot, researchers applied Benford&#8217;s Law to national-level data reported by European countries. They found that, of the data submitted by all European states, <a href="http://onlinelibrary.wiley.com/doi/10.1111/j.1468-0475.2011.00542.x/abstract">Greek data deviated from Benford&#8217;s Law more than data from any other country</a>. No wonder Greece is in trouble.</p>
<p>But I was also interested in another application I recently saw on a blog by <a href="http://econerdfood.blogspot.com/2011/10/benfords-law-and-decreasing-reliability.html">Jialan Wang</a>, a finance professor at Washington University in St. Louis. To quote Professor Wang:</p>
<blockquote><p>In fact, Benford&#8217;s law has been used in legal cases to detect corporate fraud, because deviations from the law can indicate that a company&#8217;s books have been manipulated. Naturally, I was keen to see whether it applies to the large public firms that we commonly study in finance.</p>
<p>I downloaded quarterly accounting data for all firms in Compustat, the most widely-used dataset in corporate finance that contains data on over 20,000 firms from SEC filings. I used a standard set of 43 variables that comprise the basic components of corporate balance sheets and income statements (revenues, expenses, assets, liabilities, etc.).</p></blockquote>
<p>And here&#8217;s what she found:</p>
<blockquote><p>Deviations from Benford&#8217;s law have increased substantially over time, such that today the empirical distribution of each digit is about 3 percentage points off from what Benford&#8217;s law would predict. The deviation increased sharply between 1982-1986 before leveling off, then zoomed up again from 1998 to 2002. Notably, the deviation from Benford dropped off very slightly in 2003-2004 after the enactment of Sarbanes-Oxley accounting reform act in 2002, but this was very tiny and the deviation resumed its increase up to an all-time peak in 2009.</p></blockquote>
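<p>One way to read a figure like &#8220;percentage points off&#8221; is as the average gap, across the nine possible leading digits, between the observed share and the Benford share. That reading is an assumption on my part. The quote does not specify the exact metric, and the sketch below uses made-up data rather than Compustat. The function names are hypothetical.</p>

```python
import math
from collections import Counter

# Expected Benford share for each leading digit 1..9 (standard log10 form)
BENFORD = [math.log10(1 + 1 / d) for d in range(1, 10)]

def leading_digit(x):
    """First significant digit of a positive number."""
    return int(f"{x:.15e}"[0])

def mean_abs_deviation(values):
    """Average gap, across the nine digits, between observed and Benford shares."""
    counts = Counter(leading_digit(v) for v in values)
    total = sum(counts.values())
    gaps = [abs(counts[d] / total - BENFORD[d - 1]) for d in range(1, 10)]
    return sum(gaps) / len(gaps)

# Made-up data, for illustration only
growth_like = [1.07 ** n for n in range(1, 2001)]  # spans many orders of magnitude
flat = list(range(100, 1000))                       # every leading digit equally common

print(f"growth-like: {100 * mean_abs_deviation(growth_like):.2f} percentage points")
print(f"flat:        {100 * mean_abs_deviation(flat):.2f} percentage points")
```

<p>The growth-like series should land close to zero, while the flat series, where every leading digit is equally common, comes out roughly 6 percentage points off per digit.</p>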
]]></content:encoded>
			<wfw:commentRss>http://analyticecon.com/blog/?feed=rss2&#038;p=1</wfw:commentRss>
		<slash:comments>0</slash:comments>
		</item>
	</channel>
</rss>

