My colleague and I are doing some web scraping that requires RSelenium. My code previously ran without Docker. On my new machine, however, it broke, and I had to alter the code to use Docker. Her machine also requires Docker. Because the scrape is going so slowly, I decided to also run it backwards on my old machine, the one I originally wrote the code on (it sits at home now). Not only does the old machine not require Docker, it won’t run with an active Docker session: when I try to open an RSelenium session, it says the port is in use. I have to shut down Docker before I can create the RSelenium driver.
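To narrow down the "port is in use" message, here is a minimal base-R sketch that checks whether anything is already listening on a given local port. The helper name `port_in_use` and the candidate port numbers are assumptions for illustration: 4567 is assumed to be the default port of `rsDriver()`, and 4445/4444 are ports often used when mapping a Selenium container, but our actual setup may differ.

```r
# Hypothetical helper: returns TRUE if something accepts a TCP connection
# on `port` at `host`, FALSE if the connection attempt fails.
port_in_use <- function(port, host = "localhost") {
  tryCatch({
    con <- suppressWarnings(
      socketConnection(host = host, port = port, open = "r+",
                       blocking = TRUE, timeout = 1)
    )
    close(con)  # we only wanted to know whether the connection succeeds
    TRUE
  }, error = function(e) FALSE)
}

# Assumed candidate ports; substitute the ones your own setup uses.
candidate_ports <- c(4567L, 4445L, 4444L)
status <- vapply(candidate_ports, port_in_use, logical(1))
print(data.frame(port = candidate_ports, in_use = status))
```

Running this once with Docker active and once with it shut down should show which port the two are competing for. A port reported as free could then be passed to `rsDriver(port = ...)` instead of the default.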
My colleague and I have the same OS, R version and RSelenium version. My old machine somehow has newer dependencies, but my colleague also has those newer dependencies. So we’ve ruled out differences in dependency versions.
The old machine previously ran the code on the same internet connection the new one now uses, so we’ve also ruled that out.
The one difference I’ve been able to find is in System Preferences: the old machine is set to configure IPv6 automatically, while on the new one IPv6 is set to off. That doesn’t seem consequential, and I’d rather not change it mid-scrape.
We’re thoroughly confused. If anyone has thoughts I’d be interested in clearing up the mystery.
These are our OS, R version and RSelenium version:
- R version 3.6.1 (2019-07-05)
- Platform: x86_64-apple-darwin15.6.0 (64-bit)
- Running under: macOS Mojave 10.14.6
- RSelenium Version ‘1.7.5’
These are the dependencies whose versions differ between the old and new machines. My colleague’s machine has the same versions as the old machine (the one that works without Docker):
| package | version_old_machine | version_new_machine |
|---|---|---|
| Rcpp | 1.0.3 | 1.0.2 |
| curl | 4.3 | 4.0 |
| R6 | 2.4.1 | 2.4.0 |
| caTools | 18.104.22.168 | 22.214.171.124 |
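For anyone who wants to reproduce this kind of comparison, here is a minimal sketch of how two machines' package versions could be diffed in base R. The function names `snapshot_versions` and `diff_versions` are hypothetical, and the sample vectors just reuse three rows from the table above; on a real machine the snapshot would come from `installed.packages()`.

```r
# Hypothetical helper: named character vector of installed package versions.
# Run on each machine and save the result (e.g. with saveRDS) to compare later.
# A package installed in several libraries will appear more than once.
snapshot_versions <- function() {
  installed.packages()[, "Version"]
}

# Hypothetical helper: packages present in both snapshots whose versions differ.
diff_versions <- function(old, new) {
  shared  <- intersect(names(old), names(new))
  changed <- shared[old[shared] != new[shared]]
  data.frame(
    package             = changed,
    version_old_machine = unname(old[changed]),
    version_new_machine = unname(new[changed]),
    stringsAsFactors    = FALSE
  )
}

# Sample data taken from three rows of the table above.
old_machine <- c(Rcpp = "1.0.3", curl = "4.3", R6 = "2.4.1")
new_machine <- c(Rcpp = "1.0.2", curl = "4.0", R6 = "2.4.0")

print(diff_versions(old_machine, new_machine))
```

The comparison is a plain string inequality, so it only reports that two versions differ, not which one is newer.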