Estimating the Trend at Any Point in a Noisy Time Series

17 Apr

Trends in time series are valuable. If the cost of a product rises suddenly, it likely indicates a sudden shortfall in supply or a sudden rise in demand. If the cost of claims filed by a patient rises sharply, it plausibly suggests rapidly worsening health.

But how do we estimate the trend at a particular time in a noisy time series? The answer is simple: smooth the time series using any one of many methods (local polynomials, GAMs, or the like) and then estimate the derivative(s) of the fitted function at the chosen point in time. Smoothing out the noise is essential. If you skip the smoothing and go with a naive estimate of the derivative, it can be heavily negatively correlated with the derivative estimated from the smoothed series. For instance, in an example we present, the correlation is –.47.
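
To make this concrete, here is a minimal sketch using scipy (rather than any particular package); the series, window length, and polynomial order below are illustrative choices, not the example from the post:

    # Naive vs. smoothed derivative estimates on a noisy series.
    import numpy as np
    from scipy.signal import savgol_filter

    rng = np.random.default_rng(0)
    t = np.arange(200)
    y = np.sin(t / 20) + rng.normal(scale=0.5, size=t.size)  # noisy series

    # Naive estimate: central difference (average of the one-step-forward
    # and one-step-backward change).
    naive_slope = np.gradient(y)

    # Smoothed estimate: fit a local cubic polynomial over a 21-point window
    # (Savitzky-Golay) and differentiate the fitted polynomial.
    smooth_slope = savgol_filter(y, window_length=21, polyorder=3, deriv=1)

    # With this much noise, the two sets of estimates correlate weakly.
    print(np.corrcoef(naive_slope, smooth_slope)[0, 1])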

Clarification

Sometimes we want to know what the “trend” was over a particular time window. But what that means is not 100% clear. For a synopsis of the issues, see here.

Python Package

incline provides a couple of ways of approximating the underlying function for the time series:

  • fitting a local higher-order polynomial via Savitzky-Golay over a window of choice
  • fitting a smoothing spline

The package provides a way to estimate the first and second derivatives at any given time using either of those methods. Beyond these smarter methods, the package also provides a naive estimator of slope: the average change when you move one step forward and one step backward (step = observed time units). Users can also calculate the average or maximum slope over a time window (over observed time steps).
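
A sketch of the smoothing-spline route via scipy's UnivariateSpline (incline's own API may differ; the smoothing parameter s is an arbitrary choice):

    # Fit a cubic smoothing spline, then evaluate its first and second
    # derivatives at any time point of interest.
    import numpy as np
    from scipy.interpolate import UnivariateSpline

    rng = np.random.default_rng(1)
    t = np.arange(100, dtype=float)
    y = np.log1p(t) + rng.normal(scale=0.2, size=t.size)

    spline = UnivariateSpline(t, y, k=3, s=len(t))  # s controls smoothness
    first = spline.derivative(1)   # callable: slope at any t
    second = spline.derivative(2)  # callable: curvature at any t
    print(first(50.0), second(50.0))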

Rocks and Scissors for Papers

17 Apr

Zach and Jack* write:

What sort of papers best serve their readers? We can enumerate desirable characteristics: these papers should

(i) provide intuition to aid the reader’s understanding, but clearly distinguish it from stronger conclusions supported by evidence;

(ii) describe empirical investigations that consider and rule out alternative hypotheses [62];

(iii) make clear the relationship between theoretical analysis and intuitive or empirical claims [64]; and

(iv) use language to empower the reader, choosing terminology to avoid misleading or unproven connotations, collisions with other definitions, or conflation with other related but distinct concepts [56].

Recent progress in machine learning comes despite frequent departures from these ideals. In this paper, we focus on the following four patterns that appear to us to be trending in ML scholarship:

1. Failure to distinguish between explanation and speculation.

2. Failure to identify the sources of empirical gains, e.g. emphasizing unnecessary modifications to neural architectures when gains actually stem from hyper-parameter tuning.

3. Mathiness: the use of mathematics that obfuscates or impresses rather than clarifies, e.g. by confusing technical and non-technical concepts.

4. Misuse of language, e.g. by choosing terms of art with colloquial connotations or by overloading established technical terms.

Funnily, Zach and Jack fail to take their own advice: they claim a ‘troubling trend’ without presenting systematic evidence for it, failing to distinguish anecdote from conclusions supported by evidence. But the points they make are compelling. The second and third points are especially applicable to economics, though they apply to much of scientific production.


* It is Zachary and Jacob.

What Clicks With the Users? Maximizing CTR

17 Apr

Given a pool of messages, how can you maximize CTR?

The problem of maximizing CTR reduces to the problem of estimating the probability that a person in a specific context will click on each of the messages. Once you have the probabilities, all you need to do is apply the max operator and show the message with the highest probability. Technically, you don’t need to get the point estimates right—you just need to get the ranking right.
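
A minimal sketch of the estimate-then-rank logic. The model and featurize arguments are placeholders (an sklearn-style fitted classifier and a feature builder, neither of which is specified here):

    import numpy as np

    def pick_message(model, person, context, messages, featurize):
        """Show the message with the highest predicted click probability.

        Only the ranking matters: a miscalibrated model that preserves the
        ordering picks the same message as a perfectly calibrated one.
        """
        X = np.array([featurize(person, context, m) for m in messages])
        p_click = model.predict_proba(X)[:, 1]  # sklearn-style classifier
        return messages[int(np.argmax(p_click))]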

Abstracting out, there are four levers for increasing CTR:

  1. Better models and data: Posed as a supervised learning problem, we aim to learn clicks as a function of a) the kind of content, b) the kind of context, and c) the kind of person. (And, of course, interactions among all three.) To learn preferences well, we need to improve our understanding of the content, the context, and the kinds of people. For instance, to understand content more finely, you may need to code font size, font color, etc.
  2. Modeling externalities (user learning): It sounds funny to say that the CTR of a system that shows no messages to some people some of the time can be higher than that of a system that shows at least some message to everyone every time they log in. But it can be true. If you want to increase CTR over longer horizons, you need to be able to model the impact of showing one message on a person opening the next message. If you do that, you may realize that the best option is sometimes to not show a message at all. (The other way you could ‘improve’ CTR is by losing people—you may lose the people you bombard with irrelevant messages, and the only people who ‘survive’ are those who like what you send.)
  3. Experimenting With How to Present a Message: Location on the webpage, the font, etc. all may matter. Experiment to learn.
  4. Portfolio: This lets go of the assumption of a fixed portfolio. Increase your portfolio of messages so that you have a reasonable set of things for everyone. It is easy enough to mistake people dismissing a message for disinterest in receiving messages. Don’t make that mistake. If you want to learn where you are failing, find the kinds of people for whom you have the lowest (calibrated) probability scores and think hard about what kinds of messages will appeal to them.

Quitting at 40

6 Apr

Recently, I had the pleasure of interviewing Walter Guillioli. Walter is one of those few brave people who have taken the reins of their own life. He carefully and smartly worked to save enough to live off the savings, and then quit a well-paying job at 40 to live life on his own terms.

GS: Tell us a bit more about yourself.

I grew up in a middle-income family in Guatemala. I am the youngest of five. Growing up, I enjoyed getting into trouble.

From a young age, I was taught that education is important. I studied Computer Science in college. And later, I was fortunate to get a full scholarship from the Dutch government to get an MBA.

I worked in Marketing for 10+ years until I got bored and decided to switch careers to data science. I just finished a Master of Science in Data Science from Northwestern while working full-time.

I love animals, the outdoors and the simple things in life like camping and good scenery. I also like to push myself in sports because it humbles me and helps me build character. I got a black belt in Tae Kwon Do at 38, and I am currently training for ultra-running trail races.

GS: Why did you decide to quit working full-time at 40?

WG: It is a combination of factors, but it is mostly a result of intellectual boredom and a desire to spend my time on earth doing things I love, and to not just “survive” life.

I have always questioned the purpose of (my) life and never liked the cycle most follow: study > work > get married & have kids > consume > be “busy” > (maybe get free time at old age) > die.

Professionally, I have done relatively well. Searching for “success,” I have found my dream job three times. However, each time I found my “dream job,” the excitement faded away quickly as I spent most of my time surviving meetings and going through the grind of corporate overhead. I never understood all the stress over work that I didn’t think added much value. I love intellectual challenges and good work, but they were hard to find in a big corporation.

One of my favorite quotes in Spanish translates roughly to “the richest person is not the one that has the most but the one that knows how to desire less.” And between spending my time in a cubicle working on stuff that didn’t matter to me and buying things I didn’t need, I decided to buy my time and freedom to do what I want.

I decided with my wife to live a simpler life and to move closer to nature and the mountains. I decided to spend more time with my family and raise my 2-year old. I decided that each day I will pick what to do – whether it is going for a trail run (I am training for a 52-mile run) or riding my mountain bike or dirt bike or simply walking my dogs for a few hours or playing with my son and wife in a park or just reading a book.

I will still work on projects, but only on stuff that matters to me. I want to occasionally freelance on data science projects and contribute to the world. I am also considering personal finance advising to help people.

GS: Tell us a bit more about how you planned your retirement.

WG: I never had a master plan. It has been a learning process with mistakes along the way.

The most important thing for me was changing the mindset about money. I never paid much attention to money. I spent it relatively mindlessly. However, after reading articles like this one, I realized that money is a tool to buy my time and freedom. I can’t think of anything better that money can buy.

So, we focused on understanding our expenses and figuring out ways to reduce them. It’s not about being cheap but about spending intentionally. We also started saving and investing as much as possible in index funds. The end goal became having enough money invested that we could cover our annual expenses from the returns.

GS: What’s your advice for people looking to do the same?

WG:

  1. Track and understand your annual expenses with a tool like Quicken or Mint.
  2. Save as much as you can and invest in index funds. Don’t worry about timing the market (it doesn’t work) or about having the perfect portfolio. Start investing in a broad index fund like Vanguard’s VTSAX and get a bit more sophisticated later. Learn more here.
  3. Make a list of things that truly bring you happiness and contrast that with your spending.
  4. Avoid “lifestyle inflation.” And don’t try to keep up with your neighbors. Nothing will ever be enough.
  5. Read these books: Little Book of Common Sense Investing, Simple Path to Wealth, Your Money or Your Life, Four Pillars of Investing.
  6. Read these blogs: Mr. Money Mustache, Mad Fientist
  7. Listen to the ChooseFI podcast.
  8. If you are married, make sure that everyone is on board.
  9. Have savings targets and automate everything around them so that you pay yourself first.

Citing Working Papers

2 Apr

Public versions of working papers are increasingly the norm. So are citations to them. But there are three concerns with citing working papers:

  1. Peer review: Peer review improves the quality of papers, but often enough it doesn’t catch serious, basic issues. Thus, a lack of peer review is not as serious a problem as is often claimed.
  2. Versioning: Which version did you cite? Often, there is no canonical versioning system. The best we have is tracking which conference the paper was presented at. This is not good enough.
  3. Availability: Can I check the paper, code, and data for a version? Often enough, the answer is no.

The solution to the latter two is to increase transparency through the entire pipeline. For instance, people can check how my paper with Ken has evolved on GitHub, including any coding errors that have been fixed between versions. (Admittedly, the commit messages can be improved. Better commit messages—plus descriptions—can make it easier to track changes across versions.)

The first point hardly needs addressing: the current system draws an overly optimistic line around the quality of published papers. Peer review ought not to end when a paper is published in a journal. If we accept that, then concerns flagged by peers and non-peers alike can be addressed in commits or responses to issues and credited appropriately.

A/B Testing Recommendation Systems

1 Apr

Say that you are building a news recommender that decides which news items to list in each person’s news feed. Say that your first version of the recommender is a rules-based system that uses signals like how many people in your network have seen an item, how many people in total have read it, the freshness of the item, etc., and sums up the signals in an arbitrary way to rank news items. Your second version uses the same signals but uses a supervised model to learn the optimal weights.
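
A toy version of the two systems, with made-up signal names; v1 sums the signals with arbitrary hand-set weights, while v2 learns the weights from logged clicks:

    import numpy as np
    from sklearn.linear_model import LogisticRegression

    SIGNALS = ["network_views", "total_views", "freshness"]

    def rules_score(item):
        # v1: arbitrary hand-picked weights over the same signals
        return (2.0 * item["network_views"]
                + 1.0 * item["total_views"]
                + 3.0 * item["freshness"])

    def fit_learned_scorer(logged_items, clicked):
        # v2: learn the weights from historical (signals, click) pairs
        X = np.array([[item[s] for s in SIGNALS] for item in logged_items])
        return LogisticRegression().fit(X, np.asarray(clicked))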

Say that you find that the recommendations vary a fair bit between the two systems. But which one is better? To suss that out, you conduct an A/B test. But a naive experiment will produce biased estimates of the effect and the s.e. because:

  1. The signals on which your control-group ranking system is based are influenced by the kinds of news articles that people in the treatment group see. And vice versa.
  2. There is an additional source of stochasticity in the recommendations people see: the order in which people arrive matters.

The effect of the first concern is that our estimates are likely attenuated. To resolve the first issue, show people in the control group news articles ranked on predicted views estimated from historical data, or on pro-rated views from people assigned to the control group alone. (This adds a bit of noise to the control-group estimates.) And keep a separate table of input data for the treatment group and apply the ML model to the pro-rated data from that table.
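
A sketch, with illustrative names, of the separate per-arm tables: treatment-group views never feed the control ranker (and vice versa), and within-arm counts are pro-rated back up to the full-population scale:

    from collections import defaultdict

    # One table of view counts per experimental arm.
    views = {"control": defaultdict(int), "treatment": defaultdict(int)}

    def record_view(arm, item_id):
        views[arm][item_id] += 1

    def view_signal(arm, item_id, n_arm_users, n_all_users):
        # Pro-rate the within-arm count up to the full population so the
        # signal stays on the scale the ranker expects.
        return views[arm][item_id] * (n_all_users / n_arm_users)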

The consequence of the second issue is that our s.e. is very plausibly much larger than what we will get with split-world testing (each condition gets its own table of counts for views, etc.). The sequence in which people arrive matters because it intersects with social influence. To resolve the second issue, you need to estimate how the sequence of arrival affects outcomes. But given the number of pathways, the best we can probably do is bound the effect. For instance, we could estimate the effect of ranking the least-downloaded item first as a way to bound the effects.

Advice that works

31 Mar

Writing habits of some writers:

“Early in the morning. A good writing day starts at 4 AM. By 11 AM the rest of the world is fully awake and so the day goes downhill from there.”

Daniel Gilbert

“Usually, by the time my kids get off to school and I get the dogs walked, I finally sit down at my desk around 9:00. I try to check my email, take care of business-related things, and then turn it off by 10:30—I have to turn off my email to get any writing done.”

Juli Berwald

“When it comes to writing, my production function is to write every day. Sundays, absolutely. Christmas, too. Whatever. A few days a year I am tied up in meetings all day and that is a kind of torture. Write even when you have nothing to say, because that is every day.”

Tyler Cowen

“I don’t write every day. Probably 1-2 times per week.”

Benjamin Hardy

“I’ve taught myself to write anywhere. Sometimes I find myself juggling two things at a time and I can’t be too precious with a routine. I wrote Name of the Devil sitting on a bed in a rented out room in Hollywood while I was working on a television series for A&E. My latest book, Murder Theory, was written while I was in production for a shark documentary and doing rebreather training in Catalina. I’ve written in casinos, waiting in line at Disneyland, basically wherever I have to.”

Andrew Mayne

Should we wake up at 4 am and be done by 11 am as Dan Gilbert does or should we get started at 10:30 am like Juli, near the time Dan is getting done for the day? Should we write every day like Tyler or should we do it once or twice a week like Benjamin? Or like Andrew, should we just work on teaching ourselves to “write anywhere”?

There is a certain tautological aspect to good advice: it is advice that works for you. Do what works for you. But don’t assume that the advice you have been given is right for you, or that it is the only advice on the topic. Advice givers rarely point out that the complete set of reasonable things that could work for you is often pretty large and contradictory, and that the evidence behind the advice they are giving is often no more than anecdote with a dash of motivated reasoning.

None of this is to say that you should not try hard to follow advice that you think is good. But once you see the larger point, you won’t fret as much when you can’t follow a piece of advice or when the advice doesn’t work for you. As long as you keep trying to get to where you want to be (and, of course, even the merit of some wished-for end states is debatable), it is OK to abandon some paths, safe in the knowledge that there are generally more paths to get there.

Stemming Link Rot

23 Mar

The Internet gives us many things, but none that are permanent. That is about to change. Librarians got together and recently launched https://perma.cc/, which provides permanent links to stuff.

Why is link rot important?

Here’s an excerpt from a paper by Gertler and Bullock:

“more than one-fourth of links published in the APSR in 2013 were broken by the end of 2014”

If what you are citing evaporates, there is no way to check the veracity of the claim. Journal editors: pay attention!

countpy: Incentivizing more and better software

22 Mar

Developers of Python packages sometimes envy R developers for the simple perks they enjoy, like a reliable web service that gives a reasonable fill-in for the total number of times an R package has been downloaded. To get the same number, Python developers need to run a Google BigQuery query (which costs money) and wait 30 or so seconds.

Then there are sore spots shared by all developers. Downloads are a shallow metric. Developers often want to know how often other people writing software use their package. Without such a number, it is hard to defend against accusations like “the total number of downloads is unreliable because it can be padded by numerous small releases” or “the total number of downloads doesn’t reflect how often people use the software.” We partly solve this problem for Python developers by providing a website that tallies how often a package is used in repositories on GitHub, the largest open-source software hosting platform. http://countpy.com provides the total number of times a package appears in requirements files and in import statements in Python-language repositories. (At the time of writing, the crawl is incomplete.)
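
For illustration, here is a toy version of the kind of matching involved (countpy’s actual implementation may differ): parse a Python file and check whether a package shows up in its import statements.

    import ast

    def imports_package(path, package):
        """Return True if the file at `path` imports `package`, either via
        `import pkg...` or `from pkg... import x`."""
        with open(path, encoding="utf-8") as f:
            tree = ast.parse(f.read())
        for node in ast.walk(tree):
            if isinstance(node, ast.Import):
                if any(a.name.split(".")[0] == package for a in node.names):
                    return True
            elif isinstance(node, ast.ImportFrom):
                if node.module and node.module.split(".")[0] == package:
                    return True
        return False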

The net benefit (or loss) of a piece of software is, of course, greater than mere counts of how many people use it directly in the software they build. We don’t yet count indirect use: software that uses software that uses the software of interest. Ideally, we would also like to tally the total time saved, the increase in the number of new projects started (projects that wouldn’t have started had the software not been there), the impact on the style in which other code is written, and such. We may also need to tally the cost of errors in the original software. To the extent that people don’t produce software because they can’t be credited reasonably for it, better metrics about the impact of software can increase both the production and the quality of the software being provided.

Searching for Great Conversations

21 Mar

“When was the last time you had a great conversation? A conversation that wasn’t just two intersecting monologues, but one in which you overheard yourself saying things you never knew you knew, in which you heard yourself receiving from somebody words that found places within you that you thought you had lost, with the sense of an eventive conversation that brought the two of you onto a different plane and then, fourthly, a conversation that continued to sing afterward for weeks in your mind? Conversations like that are food and drink for the soul.”


John O’Donohue (h/t David Perell)