Let’s find all of the boats with synthetic aperture radar

It’s been a while! Let’s start with something easy: I was privileged to be a part of the team that put together “Satellite mapping reveals extensive industrial activity at sea,” published last month in Nature. This is part of our ongoing effort to figure out where industrial activity is happening on the oceans. Knowledge about this is surprisingly sparse, but Earth observation satellites have improved the situation a lot.

Update to Jupyter on GCE

Quick update: in an earlier post I showed one way to run Jupyter notebooks remotely on GCE. Since then I found there is a simpler way to write the SSH command. Anything after -- in the gcloud compute ssh command is passed directly to ssh. So rather than using multiple instances of --ssh-flag, one can instead use:

gcloud compute ssh img-detection-gpu-3 -- \
        -L 9999:localhost:8888

I’ve also taken to using rmate to use Sublime Text remotely on GCE. In this case the command becomes:

gcloud compute ssh img-detection-gpu-3 -- \
         -L 9999:localhost:8888 \
         -R 52698:localhost:52698
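
If you open these tunnels often, the two commands above can be wrapped in a small shell function. This is only a sketch: the function name gce_tunnel_cmd and its argument order are made up for illustration, and it just prints the command so you can inspect it before running it.

```shell
# Hypothetical helper (not part of gcloud): print the tunnel command.
# Usage: gce_tunnel_cmd INSTANCE [LOCAL_PORT] [REMOTE_PORT] [RMATE_PORT]
gce_tunnel_cmd() {
    instance="$1"
    local_port="${2:-9999}"    # port the browser connects to locally
    remote_port="${3:-8888}"   # port Jupyter listens on remotely
    rmate_port="${4:-}"        # optional: port to forward back for rmate
    cmd="gcloud compute ssh $instance -- -L $local_port:localhost:$remote_port"
    if [ -n "$rmate_port" ]; then
        cmd="$cmd -R $rmate_port:localhost:$rmate_port"
    fi
    printf '%s\n' "$cmd"
}

# Print the command, then run it once it looks right:
gce_tunnel_cmd img-detection-gpu-3 9999 8888 52698
# eval "$(gce_tunnel_cmd img-detection-gpu-3 9999 8888 52698)"
```

Leaving off the last argument gives the Jupyter-only tunnel; passing 52698 adds the reverse forward that rmate uses.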

Curve Fitting

A while back a colleague tweaked me with the joke that machine learning is just glorified curve fitting. This is true as far as it goes, but a large, modern neural net (e.g., VGG-16 with 138 million parameters) has approximately the same relationship with a linear fit (2 parameters) that the bomb dropped on Hiroshima (Little Boy with a yield of 63 TJ) had with a stick of dynamite (1 MJ).

The relative danger is almost certainly not as great, but you are still considerably more likely to cause yourself and others grief with the careless application of modern machine learning methods than with a linear fit.
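
The "approximately the same relationship" claim above is easy to check with a bit of arithmetic, using only the numbers quoted in the comparison. A quick sketch, assuming a shell with 64-bit integer arithmetic (bash, for example):

```shell
# Parameter ratio: VGG-16 (138 million) versus a linear fit (2).
param_ratio=$(( 138000000 / 2 ))
# Energy ratio in joules: Little Boy (63 TJ) versus a stick of dynamite (1 MJ).
energy_ratio=$(( 63000000000000 / 1000000 ))
echo "parameters: $param_ratio"   # prints "parameters: 69000000"
echo "energy:     $energy_ratio"  # prints "energy:     63000000"
```

Both ratios come out in the tens of millions, within about 10% of each other.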

Jupyter on GCE

I was recently inspired to set up Jupyter to run remotely on a GCE instance. I have access to a lot of computing resources for work, so it’s silly to run things locally on my laptop, but running interactive Python sessions remotely can be painful due to latency and the vagaries of terminals. Running Jupyter seems like a perfect fit here, since the editing is done locally (no lag) and Jupyter can be nicely self-documenting for moderate-sized projects. [1]

Jeff Delaney has a helpful post on setting Jupyter up on GCE, and the Jupyter docs on running a public server also have some useful information. However, the solutions for exposing Jupyter to the web were not terribly secure, painful to implement, or both. Since I’m only interested in being able to run the server myself, a simple, relatively secure solution is to use SSH tunneling. So rather than exposing ports publicly on GCE, just start the Jupyter server on your GCE instance with the --no-browser option.

jupyter notebook --no-browser

Then, on your local machine, run:

gcloud compute ssh nnet-inference \
                       --ssh-flag="-L" \
                       --ssh-flag="9999:localhost:8888"

And point your browser to http://localhost:9999.

That’s it. Now you can use Jupyter remotely without opening up public ports on your GCE instance. [2]
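
The tmux habit mentioned in the notes for this post can be sketched as a small function to run on the GCE instance. The function name and the default session name "jupyter" are arbitrary choices for illustration; the tmux subcommands themselves (has-session, new-session) are standard.

```shell
# Hypothetical helper: start Jupyter in a detached tmux session unless a
# session with that name already exists.
start_jupyter_session() {
    session="${1:-jupyter}"   # session name is an arbitrary choice
    if tmux has-session -t "$session" 2>/dev/null; then
        echo "tmux session '$session' is already running"
    else
        tmux new-session -d -s "$session" 'jupyter notebook --no-browser'
        echo "started Jupyter in tmux session '$session'"
    fi
}
```

After calling start_jupyter_session, `tmux attach -t jupyter` shows the server output. If your connection drops, the notebook keeps running and only the tunnel on your local machine needs to be restarted.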

References

1. Once projects hit a certain size though, Jupyter becomes inscrutable and really needs to be modularized.
2. A couple of minor notes: I run my notebook inside tmux so that it stays alive if my connection drops. And if the connection drops you’ll need to restart the tunnel.

MIA

I’ve been too busy trying to classify fishing vessels for Global Fishing Watch to post lately, but I’m hoping that I’ll have a bit more time now.

A Connection Between RMSPE and the Log Transform

Frequency Content of a Jittered Step