Louis C.K. on Donald Drumpf

Just received an email from Louis C.K. about the latest Horace and Pete episode, and there’s a nice little treat at the end. It’s too good not to share, so I’ve reprinted it in full below. I love this man.

Hello there.  Your name is “there” isn’t it? Anyway hello. I’m writing, of course, to let you know that Horace and Pete episode 6 is available for streaming and download.

Go here to watch it.

This week begins act two. Our guest star is the terrific Hannah Dunne. I think doing this show is the most fun I’ve ever had.

I’d like to also thank everyone in the rest of the world for supporting the show. The show is selling well in England, France, Germany, Denmark, Australia, India, Israel and more. I wish I had the resources to create a subtitled version of the show in every language but it’s already a challenge to shoot the show and get it up on the site so quickly every week.

Also, as the show is not being advertised and promoted anywhere, please share it with your friends and people you think would like it. Please don’t show it to anyone you think would hate it. Although I do believe a show needs to be hated. It’s part of the life of any show to have some people who devote energy to ripping it apart. It’s healthy. Anyway it continues to be very interesting to watch a show spread and grow strictly on word of mouth. And you are the mouths. I mean your mouths are the mouths that… Make words. So please… Word… About it. The show.

To other mouths. I mean don’t talk into people’s mouths though.

Okay. I’m going back to bed. My kids don’t get here for another hour.

Thanks again.

Louis CK

P.S.  Please stop it with voting for Trump. It was funny for a little while. But the guy is Hitler. And by that I mean that we are being Germany in the 30s. Do you think they saw the shit coming? Hitler was just some hilarious and refreshing dude with a weird comb over who would say anything at all.

And I’m not advocating for Hillary or Bernie. I like them both but frankly I wish the next president was a conservative only because we had Obama for eight years and we need balance. And not because I particularly enjoy the conservative agenda. I just think the government should reflect the people. And we are about 40 percent conservative and 40 percent liberal. When I was growing up and when I was a younger man, liberals and conservatives were friends with differences. They weren’t enemies. And it always made sense that everyone gets a president they like for a while and then hates the president for a while. But it only works if the conservatives put up a good candidate. A good smart conservative to face the liberal candidate so they can have a good argument and the country can decide which way to go this time.

Trump is not that. He’s an insane bigot. He is dangerous.

He already said he would expand libel laws to sue anyone who “writes a negative hit piece” about him. He says “I would open up the libel laws so we can sue them and win lots of money. Not like now. These guys are totally protected.” He said that. He has promised to decimate the first amendment. (If you think he’s going to keep the second amendment intact you’re delusional.) And he said that Paul Ryan, speaker of the house will “pay” for criticizing him. So I’m saying this now because if he gets in there we won’t be able to criticize him anymore.

Please pick someone else. Like John Kasich. I mean that guy seems okay. I don’t like any of them myself but if you’re that kind of voter please go for a guy like that. It feels like between him and either democrat we’d have a decent choice. It feels like a healthier choice. We shouldn’t have to vote for someone because they’re not a shocking cunt billionaire liar.

We should choose based on what direction the country should go.

I get that all these people sound like bullshit soft criminal opportunists. The whole game feels rigged and it’s not going anywhere but down anymore. I feel that way sometimes.

And that voting for Trump is a way of saying “fuck it. Fuck them all”. I really get it. It’s a version of national Suicide. Or it’s like a big hit off of a crack pipe. Somehow we can’t help it. Or we know that if we vote for Trump our phones will be a reliable source of dopamine for the next four years. I mean I can’t wait to read about Trump every day. It’s a rush. But you have to know this is not healthy.

If you are a true conservative. Don’t vote for Trump. He is not one of you. He is one of him. Everything you have heard him say that you liked, if you look hard enough you will see that he one day said the exact opposite. He is playing you.

In fact, if you do vote for Trump, at least look at him very carefully first. You owe that to the rest of us. Know and understand who he is. Spend one hour on google and just read it all. I don’t mean listen to me or listen to liberals who put him down. Listen to your own people. Listen to John McCain. Go look at what he just said about Trump. “At a time when our world has never been more complex or more in danger… I want Republican voters to pay close attention to what our party’s most respected and knowledgeable leaders and national security experts are saying about Mr. Trump, and to think long and hard about who they want to be our next Commander-in-Chief and leader of the free world.”

When Trump was told what he said, Trump said “Oh, he did? Well, that’s not nice,” he told CBS News’ chief White House correspondent Major Garrett. “He has to be very careful.”

When pressed on why, Trump tacked on: “He’ll find out.”

(I cut and pasted that from CBS news)

Do you really want a guy to be president who threatens John McCain? Because John McCain cautiously and intelligently asked for people to be thoughtful before voting for him? He didn’t even insult Trump. He just asked you to take a good look. And Trump told him to look out.

Remember that Trump entered this race by saying that McCain is not a war hero. A guy who was shot down, body broken and kept in a POW camp for years. Trump said “I prefer the guys who don’t get caught.” Why did he say that? Not because he meant it or because it was important to say. He said it because he’s a bully and every bully knows that when you enter a new school yard, you go to the toughest most respected guy on the yard and you punch him in the nose. If you are still standing after, you’re the new boss. If Trump is president, he’s not going to change. He’s not going to do anything for you. He’s going to do everything for himself and leave you in the dust.

So please listen to fellow conservatives. But more importantly, listen to Trump. Listen to all of it. Everything he says. If you liked when he said that “torture works” then go look at where he took it back the next day. He’s a fucking liar.

A vote for Trump is so clearly a gut-vote, and again I get it. But add a little brain to it and look the guy up. Because if you vote for him because of how you feel right now, the minute he’s president, you’re going to regret it. You’re going to regret it even more when he gives the job to his son. Because American democracy is broken enough that a guy like that could really fuck things up. That’s how Hitler got there. He was voted into power by a fatigued nation and when he got inside, he did all his Hitler things and no one could stop him.

Again, I’m not saying vote democrat or vote for anyone else. If Hillary ends up president it should be because she faced the best person you have and you and I both chose her or him or whoever. Trump is not your best. He’s the worst of all of us. He’s a symptom to a problem that is very real. But don’t vote for your own cancer. You’re better than that.

That’s just my view. At least right now. I know I’m not qualified or particularly educated and I’m not right instead of you. I’m an idiot and I’m sure a bunch of you are very annoyed by this. Fucking celebrity with an opinion. I swear this isn’t really a political opinion. You don’t want to know my political opinions.   (And I know that I’m only bringing myself trouble with this shit.) Trump has nothing to do with politics or ideology. He has to do with himself. And really I don’t mean to insult anyone. Except Trump. I mean to insult him very much. And really I’m not saying he’s evil or a monster. In fact I don’t think Hitler was. The problem with saying that guys like that are monsters is that we don’t see them coming when they turn out to be human, which they all are. Everyone is. Trump is a messed up guy with a hole in his heart that he tries to fill with money and attention. He can never ever have enough of either and he’ll never stop trying. He’s sick. Which makes him really really interesting. And he pulls you towards him which somehow feels good or fascinatingly bad. He’s not a monster. He’s a sad man. But all this makes him horribly dangerous if he becomes president. Give him another TV show. Let him pay to put his name on buildings. But please stop voting for him. And please watch Horace and Pete.

Earthquake catalog for Southern California updated



For anyone interested in high-resolution earthquake locations and focal mechanisms in southern California, I’ve updated my R package yhs.catalog to include the latest update to the Yang, Hauksson, and Shearer (2012) catalog from the Southern California Earthquake Data Center.

There are currently 196,993 earthquakes from 1981 through 2014 in the catalog:


Earthquakes in southern California since 1981. (Check out the github page in the link for code to reproduce this figure.)

Use devtools to install:

if (!require(devtools)) install.packages("devtools")


Release of psd 1.0 to CRAN

Power spectral density estimates of the Project MAGNET dataset from  psd::pspectrum (lines), compared to those from stats::spectrum (points).

Greetings, Interweb!

I’m pleased to announce psd 1.0, a long-overdue major update from the 0.* series that includes significant advancements in performance, improved clarity and consistency of documentation and method/class handling, and the elimination of a few long-standing bugs.

Some major changes include:

  • Most importantly, all major bottlenecks have been eliminated with new C++ code: the adaptive method is much faster now. Thanks to Dirk, Romain, and any other Rcpp (and RcppArmadillo) contributors for building such a fantastic package!
  • Unit testing: I’ve put in place the framework for unit-testing with testthat; so far there are only a few tests, but I’ll be adding more in the future. Thanks to Hadley and RStudio crew for yet another fantastic package!
  • Travis CI: automatic build-checking is done on each commit to the codebase. (How that system works so well is amazing.)
  • Dependency on fftw dropped: it’s been a frustrating process trying to ensure that the fftw dependency would be satisfied (my requests have generally fallen on deaf ears), so I dropped it; instead psd now uses good ole’ stats::fft, despite the speed disadvantage for very long timeseries. I justify this by noting the dramatic speed increases afforded by the new C++ code, and that only a single DFT needs to be calculated during the adaptive procedure.
  • Loss of some backwards compatibility: unfortunately I’ve been unable to reconcile changes with the silly things I built into the original versions, so you may find that scripts written for 0.4.* fail — if this is a major issue feel free to get in touch with me and hopefully we can straighten things out.
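
As a rough illustration of what a single DFT from stats::fft gives you, here is a minimal sketch of a raw periodogram computed on a made-up signal. This is not psd’s adaptive multitaper method, just the basic building block; the test signal and variable names are assumptions for illustration. With tapering, detrending, and padding switched off, it should agree with stats::spec.pgram.

```r
# Made-up test signal: a sinusoid plus white noise (illustration only)
set.seed(123)
n <- 256
x <- sin(2 * pi * 0.1 * seq_len(n)) + rnorm(n)

# Raw periodogram from a single DFT: |X(f)|^2 / n at the positive
# Fourier frequencies (the zero frequency is dropped)
X <- fft(x)
nspec <- floor(n / 2)
pgram <- Mod(X[2:(nspec + 1)])^2 / n
freq <- seq_len(nspec) / n

# Reference: stats::spec.pgram with tapering, detrending and padding off
ref <- spec.pgram(x, taper = 0, detrend = FALSE, fast = FALSE, plot = FALSE)

max(abs(pgram - ref$spec))   # difference from the reference estimate
freq[which.max(pgram)]       # location of the spectral peak
```

The peak should sit close to the 0.1 cycles-per-sample sinusoid that was put into the signal.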

Please submit Issues/Requests through github. And, as always, I’m happy to supply a reprint of the journal article accompanying the package.


Time to Accept It: publishing in the Journal of Statistical Software

When I was considering submitting my paper on psd to J. Stat. Soft. (JSS), I kept noticing that the time from “Submitted” to “Accepted” was nearly two years in many cases. I ultimately decided that was much too long a review process, no matter what the impact factor might be (and in two years’ time, would I even care?). Tonight I had the sudden urge to put together a dataset of times to publication.

Fortunately the JSS website is structured such that it only took a few minutes playing with XML scraping (*shudder*) to write the (R) code to reproduce the full dataset.  I then ran a changepoint (published in JSS!) analysis to see when shifts in mean time have occurred.  Here are the results:


Top: The number of days for a paper to go from ‘Submitted’ to ‘Accepted’ as a function of the cumulative issue index (each paper is an “issue” in JSS).
Middle: In log2(time), with lines for one month, one year, and two years.
Bottom frame: changepoint analyses.

Pretty interesting stuff, but kind of depressing: the average time it takes to publish is about 1.5 years, with a standard deviation of 206 days.  There are many cases where the paper review is <1 year, but those tend to be in the ‘past’ (prior to volume 45, issue 1).
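
To make the idea of a shift in mean concrete, here is a minimal base-R sketch that finds a single changepoint by brute force on made-up data. It is a simplified stand-in, not the segment-neighbourhood method from the changepoint package used for the figure; the synthetic series and the helper name single_cpt_mean are assumptions for illustration only.

```r
# Synthetic "days to acceptance" series (made up for illustration only):
# 60 papers around 200 days followed by 40 papers around 550 days.
set.seed(42)
days <- c(rnorm(60, mean = 200, sd = 50), rnorm(40, mean = 550, sd = 50))

# For each candidate split k, sum the squared deviations from the segment
# means on either side; the best single changepoint minimises that cost.
single_cpt_mean <- function(x, min_seg = 2) {
  n <- length(x)
  ks <- seq(min_seg, n - min_seg)
  cost <- vapply(ks, function(k) {
    left <- x[seq_len(k)]
    right <- x[(k + 1):n]
    sum((left - mean(left))^2) + sum((right - mean(right))^2)
  }, numeric(1))
  k_hat <- ks[which.min(cost)]
  list(cpt = k_hat,
       means = c(mean(x[seq_len(k_hat)]), mean(x[(k_hat + 1):n])))
}

fit <- single_cpt_mean(days)
fit$cpt    # index of the last observation in the first segment
fit$means  # estimated mean of each segment
```

With several changepoints, as in the JSS data, the same cost is minimised over multiple splits at once, which is what dedicated packages handle efficiently.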

Of course, these results largely reflect an increase in academic impact (JSS is becoming more impactful), which simultaneously increases the number of submissions for the editors to deal with.  So, these data should be normalized by something.  By what, exactly, I don’t know.

And, finally, I can’t imagine how the authors of the paper that went through a 1400+ day review process felt — or are they still feeling the sting?

Here’s my session info:

R version 3.1.0 (2014-04-10)
Platform: x86_64-apple-darwin13.1.0 (64-bit)

locale:
[1] en_US.UTF-8/en_US.UTF-8/en_US.UTF-8/C/en_US.UTF-8/en_US.UTF-8

attached base packages:
[1] stats graphics grDevices utils datasets methods base

other attached packages:
[1] changepoint_1.1.5 zoo_1.7-11 plyr_1.8.1 XML_3.98-1.1

loaded via a namespace (and not attached):
[1] grid_3.1.0 lattice_0.20-29 Rcpp_0.11.1 tools_3.1.0

And here's the R code needed to reproduce the dataset and figure:

# Packages used below (see the session info above)
library(XML)
library(plyr)
library(changepoint)

#Current Volume:
cvol <- 58
# set to 'TRUE' if you want to
# reproduce the dataset with each
# run (not recommended)
redo <- FALSE

jstat.xml <- function(vol=1, tolist=TRUE){
 # Fetch and parse the index page for one volume
 src <- "http://www.jstatsoft.org/"
 vsrc <- sprintf("%sv%i",src,vol)
 X <- xmlParse(vsrc)
 if (tolist) X <- xmlToList(X)
 return(X)
}

jstat.DTP <- function(vol=1, no.authors=FALSE, no.title=FALSE){
 # Get article data
 xl <- jstat.xml(vol)
 Artic <- xl$body[[4]]$div$ul # article data

 # Vol,Issue
 issues <- ldply(Artic, function(x) return(x[[5]][[1]]))$V1
 issues2 <- ldply(strsplit(issues,split=","),
 .fun=function(x){gsub("\n Vol. ","",gsub("Issue ","",x))})

 # Accepted
 dates <- ldply(Artic, function(x) return(x[[6]][[1]]))$V1
 dates2 <- ldply(strsplit(dates, split=","),
 .fun=function(x){as.Date(gsub("Accepted ","",gsub("Submitted ","",x)))})

 Dat <- data.frame(Volume=issues2$V1, Issue=issues2$V2, Date=issues2$V3,
 Submitted=dates2$V1, Accepted=dates2$V2,
 Days.to.pub=as.numeric(difftime(dates2$V2, dates2$V1, units="days")),
 Author=NA, Title=NA)

 if (!no.authors){
 # Authors
 Dat$Author <- ldply(Artic, function(x) return(x[[3]][[1]]))$V1
 }

 if (!no.title){
 # Title
 Dat$Title <- ldply(Artic, function(x) return(x[[1]][[1]]))$V1
 }

 return(Dat)
}


# Shakedown
#ldply(57:58, jstat.DTP)

if (!exists("Alldata") || redo){
 Alldata <- ldply(seq_len(cvol), jstat.DTP)
 save(Alldata, file="JStatSoft_DtP.rda")
} else {
 # Alldata already exists in the workspace; reuse it
}

Alldata.s <- arrange(Alldata, Days.to.pub)
Cpt <- suppressWarnings(cpt.mean(Alldata$Days.to.pub, method="SegNeigh", Q=4))
niss <- length(Cpt@data.set)


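PLT <- function(){
 # NOTE: the layout() and plot() calls were missing from this listing;
 # they are reconstructed here from the surviving arguments and the
 # figure caption, so treat them as a best guess
 par(las=1, cex=0.8)
 layout(matrix(1:3))

 with(Alldata, {
 # Top: days from 'Submitted' to 'Accepted'
 plot(Days.to.pub,
 xlim=c(0,niss), #xaxs="i",
 type="l", col="grey", ylab="Days", xaxt="n",
 xlab="")
 points(Days.to.pub, pch=3)
 mtext("Time to 'Accepted': J. Stat. Soft.", font=2, line=0.5)
 axis(1, labels=FALSE)
 # Middle: log2 days, with lines at one month, one year, and two years
 plot(log2(Days.to.pub),
 xlim=c(0,niss), #xaxs="i",
 pch=3, ylab="log2 Days")
 yt <- log2(365*c(1/12,1:2)) # 365 days per year (was 356)
 abline(h=yt, col="red", lty=2)
 })

 # Bottom: changepoint analysis
 plot(Cpt,
 xlim=c(0,niss), #xaxs="i",
 xlab="Issue index", ylab="Days")
 mtext("Changes in mean", font=2, line=-2, adj=0.1, cex=0.8)

 # Label each segment with its estimated mean
 Dm <- lbls <- round(param.est(Cpt)$mean)
 lbls[1] <- paste("section mean:",lbls[1])
 Lc <- cpts(Cpt)
 yt <- -75-max(Dm)+Dm
 xt <- c(0,Lc)+diff(c(0,Lc,niss))/2

 text(xt, yt, lbls, col="red", cex=0.9)
}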


Row-wise summary curves in faceted ggplot2 figures

I really enjoy reading the Junk Charts blog.  A recent post made me wonder how easy it would be to add summary curves for small-multiple type plots, assuming the “small multiples” to summarize were the X component of a ggplot2::facet_grid(Y ~ X) layer.  In other words, how could I plot the same summary curve across each row of the faceted plot?

First we need some data.  I have been working on a spectrum estimation tool with Robert Parker, and ran some benchmark tests of the core function against a function with similar functionality, namely spec.mtm in the multitaper package.

The benchmarking was done using the rbenchmark package.  In short, I generate an auto-regressive simulation using arima.sim(list(order = c(1,1,0), ar = 0.9), n), and then benchmark the functions for incremental increases in n (the length of the simulated set);  here is the resulting information as an R-data file. (I’m not showing the code used to produce the data, but if you’re curious I’ll happily provide it.)
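
For the curious, the general shape of such a benchmark can be sketched with base R only. This is a rough stand-in under stated assumptions: it uses system.time rather than rbenchmark, times stats::spec.pgram rather than the functions actually compared, and the series lengths and repetition count are arbitrary.

```r
set.seed(1)
ns <- c(500, 1000, 2000)  # series lengths to test (arbitrary choices)

timings <- do.call(rbind, lapply(ns, function(n) {
  # Auto-regressive simulation, as described in the text
  x <- arima.sim(list(order = c(1, 1, 0), ar = 0.9), n)
  # Time five repeated spectrum estimates of the same series
  tm <- system.time(for (i in 1:5) spec.pgram(x, plot = FALSE))
  data.frame(num_terms = n,
             user.self = tm[["user.self"]],
             sys.self  = tm[["sys.self"]],
             elapsed   = tm[["elapsed"]])
}))

timings
```

Each row holds the timings for one series length, which is the same long-ish layout that gets melted and summarised below.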

With a bit of thought (and trial-and-error for me), I found Hadley’s reshape2 and plyr packages made it straightforward to calculate the group statistics (note some prior steps are skipped for brevity, but the full code is linked at the end):

## reduce data.frame with melt
allbench.df.mlt <- reshape2::melt(allbench.df.drp, id.vars=c("test","num_terms"))

## calculate the summary information to be plotted:
## 'value' can be anything, but here we use mean values (despite the 'medians' label) from ggplot2::mean_cl_normal, a wrapper around Hmisc::smean.cl.normal, which calculates confidence limits from the t-distribution
## 'summary' is not important for plotting -- it's just a name
tmpd <- plyr::ddply(allbench.df.mlt, .(variable, num_terms), summarise, summary="medians", value=mean_cl_normal(value)[1,1])

## create copies of 'tmpd' for each test, and map them to one data.frame
tests <- unique(allbench.df$test)
allmeds <- plyr::ldply(lapply(X=tests, FUN=function(x,df=tmpd){df$test <- x; return(df)}))

Here’s the final result, after adding a ggplot2::geom_line layer with the allmeds data frame to the faceted plot:


This type of visualization helps visually identify differences among subsets of data. Here, the lines help distinguish the benchmark information by method (facet columns). Of course the stability of benchmark data depends on the number of replications, but here we can see the general shape of the user.self and elapsed times are consistent across the three methods, and that the rlpSpec methods consume less sys.self time with increasing series length. Most surprising to me is the convergence of relative times with increasing series length. When the number of terms is more than approximately 5000, the methods have roughly equal performance; below this threshold the spec.mtm method can be upwards of 2-3 times faster, which should not be too surprising given that it calls Fortran source code.
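
To make the replication step concrete, here is a minimal base-R sketch on made-up timings (the method names, values, and object names are all hypothetical). It computes one summary value per row variable and series length, then stamps a copy onto every method so that each facet column receives the same curve.

```r
# Made-up benchmark timings: 2 methods x 3 series lengths x 2 timing variables
bench <- expand.grid(test = c("methodA", "methodB"),
                     num_terms = c(100, 1000, 10000),
                     variable = c("elapsed", "user.self"),
                     stringsAsFactors = FALSE)
bench$value <- seq_len(nrow(bench))

# One summary value per (variable, num_terms), pooled over all methods
summ <- aggregate(value ~ variable + num_terms, data = bench, FUN = mean)

# Stamp a copy of the summary onto every method, so each facet column
# gets the same curve within a row
allsumm <- do.call(rbind, lapply(unique(bench$test), function(tst) {
  data.frame(test = tst, summ, stringsAsFactors = FALSE)
}))

allsumm
```

A data frame shaped like allsumm can then be handed to a geom_line layer on top of the faceted plot, in the same way as allmeds.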

I assume there is a slick way to do this with ggplot2::stat_summary, but I was scratching my head trying to figure it out.  Any insight into a better or easier way to do this is especially welcome!

Here is the code to produce the figure, as a gist.  If you have any troubles accessing the data, please let me know.