Blog Articles 161–165

DNT might be useful

I’ve been bearish on Do Not Track. Providing user control over tracking is useful, but DNT has seemed to have exactly zero utility, being the Internet-tracking equivalent of the evil bit. Trackers have little incentive to honor it, short of regulatory pressure or just trying to be good people, and the particularly bad actors I most want to keep my data away from are never going to respect it even with regulation. So turn on RequestPolicy and forget about it.

Then this crossed my radar. Twitter is personalizing the new-user experience based on tracking data collected via tweet buttons, so your new account automatically has suggestions of people to follow that might actually be relevant. This has huge implications for improving the experience for new users: much of the benefit of Twitter depends on the streams you follow, and connecting users to relevant streams right off the bat seems like a great way to improve the quality of the initial experience and thereby improve user retention.

Thanks in part to Do Not Track, Twitter is also building this feature in the most responsible way I can imagine:

  • If you have Do Not Track set in your browser, they don’t track your data.
  • They purge data after 10 days, so they don’t hang on to it indefinitely and your new-user experience is only based on the last 10 days of web visits.
  • They describe the feature, and the data they collect.
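For readers who have not seen the signal itself: a browser with Do Not Track enabled sends the request header "DNT: 1". The sketch below shows roughly what honoring it could look like on the server side. It is not Twitter's code. It assumes a CGI-style environment where the header arrives as the HTTP_DNT variable, and the function name tracking_allowed is made up for illustration.

```shell
# tracking_allowed: succeed (exit status 0) unless the request carried "DNT: 1".
# Assumption: CGI-style environment, so the DNT request header is
# exposed to the script as the HTTP_DNT variable.
tracking_allowed() {
    [ "${HTTP_DNT:-}" != "1" ]
}

# Usage: only record the visit when the user has not opted out.
if tracking_allowed; then
    echo "recording visit"
else
    echo "DNT set, not recording"
fi
```

An unset header is treated here as "no preference expressed", which is a design choice of this sketch rather than anything the post describes.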

Joining Lines with sed

I needed to join lines in a shell script this week. Specifically, I had a file containing file names, one per line, and needed them colon-separated on a single line.

I could have reached for Perl or Awk, but a bit of searching turned up this solution in sed:

sed -e ':a $!N; s/\n/:/; ta' test.txt

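sed normally processes one line at a time and strips the trailing newline before the script sees the line, so a simple s/\n/:/g won’t work; it never finds a newline to replace. The script above gets around that: N appends the next input line to the pattern space (the $! prefix skips it on the last line), s/\n/:/ turns the embedded newline into a colon, and ta jumps back to the label a whenever that substitution succeeded, so the loop runs until the whole file is one line.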
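If you want to drop this into a script, here is a minimal sketch that wraps the same loop in a function. The function names and the sample file names are made up for illustration. Each sed command is passed as its own -e expression, because some sed implementations read a label all the way to the end of the line, so the one-line form may not work everywhere. The second function uses paste instead of sed, a different tool that does the same job with its -s (serial) and -d (delimiter) options.

```shell
# join_lines_sed FILE: print the lines of FILE joined by ':' on one line.
# Same loop as the one-liner, split into separate -e expressions so the
# label "a" ends where the expression ends.
join_lines_sed() {
    sed -e ':a' -e '$!N' -e 's/\n/:/' -e 'ta' "$1"
}

# join_lines_paste FILE: the same result using paste rather than sed.
join_lines_paste() {
    paste -s -d ':' "$1"
}

# Usage, with a throwaway sample file (the file names are made up):
list=$(mktemp)
printf '%s\n' a.txt b.txt c.txt > "$list"
join_lines_sed "$list"     # prints a.txt:b.txt:c.txt
join_lines_paste "$list"   # prints a.txt:b.txt:c.txt
rm -f "$list"
```

Both versions leave a single-line file unchanged, since there is no newline to replace.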

Romney and the Working Poor

The problem is living the dream has blinded [Romney] to other people’s reality. His comments evince no understanding of how difficult it is to focus on college when you’re also working full time, how much planning it takes to reliably commute to work without a car, how awful it is to choose between skipping a day on a job you can’t afford to lose and letting your sick child fend for herself. The working poor haven’t abdicated responsibility for their lives. They’re drowning in it.

What Romney doesn’t understand about personal responsibility: a highly recommended read on what life is really like for the working poor.

App.net: Bring on the cross-posts

There is an emerging social norm on app.net, or at least among some of its vocal members, that cross-posting content from Twitter (e.g. with IFTTT) is unwelcome.

As I said in that thread, I think this norm is unhelpful for encouraging adoption and building a diverse community and information flow.

Authors/publishers, both in the large and in the small, write and post where there is engagement. If there is no engagement, there is little incentive to post. But also, if there is nothing posted, there is nothing to engage with. Chicken, egg. Vicious or virtuous circle.

Content mirroring provides a way to test the waters, so to speak. It provides a zero-marginal-effort means to get your content onto ADN and see if the community there is interested in it. If there’s engagement, (A) reciprocate the engagement, and (B) consider making dedicated or channel-specific content. This does require that you read ADN, monitor the mentions, etc., but none of that is visible in the “posted with IFTTT” blurb.

RecSys 2012 Preview

I’m headed out today for RecSys 2012, and have my fingers in a number of pies. Places you can see things I’m connected with somehow:

  • I will have a poster for our short paper on identifying when different recommenders make different mistakes (particularly, when is one wrong but another right?) at the poster session.
  • Daniel Kluver is presenting our paper on estimating the information content of ratings. This is a very cool line of work some of our newer students are spearheading, providing interesting tools for measuring and quantifying certain aspects of how recommender systems relate to their users.
  • The demo session is looking amazing (I have the privilege of serving as demos co-chair this year). We’ve got a great mix of academic and industrial projects showing off their work. Really looking forward to it.
  • I’ll be giving two talks at the RecSys Challenge, one about the MovieLens data set and another about LensKit.

Of course, I’ll be around all week. I’ll also be at the Decisions workshop on Sunday.

Run me down & say hi - I’d love to meet you! Or reconnect, if we know each other already.