Extracting values from js source or HTML tags with R?

July 22, 2019, at 2:00 PM

I'm trying to build a pipeline that loads into my SQL database every player who has played in the NBA, together with their unique player IDs (as shown in the image below), using this webpage.

How the IDs Manifest Themselves

I was able to do this in Python (writing a CSV instead) by manually copying a variable from the stats_ptsd.js file, which I found in the network responses when I inspected the page, into a list. I'm not showing that Python code because it doesn't scrape the page; it only references the manually copied list.

Network Responses

How the CSV Looks

Now I'm not sure how to scrape the information with R. I've tried many different methods I've seen online, most of them using the rvest package, but none has produced any meaningful output or an error message I could show here. Hopefully someone can suggest the best way to do this, whether by accessing the .js file or by scraping the HTML elements. The XPath for a player's 'a' HTML element with the valid href is shown below.

//*[contains(concat( " ", @class, " " ), concat( " ", "players-list__name", " " )) and (((count(preceding-sibling::*) + 1) = 91) and parent::*)]//a
Answer 1

The data comes from a JS file that you can find in the network tab. You can use a regex (or substring) to pull out the JavaScript object assigned inside it and then parse that string with a JSON parser.

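library(rvest)     # read_html(), html_node(), html_text()
library(stringr)   # str_match_all()
library(magrittr)  # %>% pipe
library(jsonlite)  # fromJSON()

# Download the .js file; after parsing, the script text ends up inside the body node
r <- read_html('https://stats.nba.com/js/data/ptsd/stats_ptsd.js') %>%
  html_node('body') %>%
  html_text() %>%
  toString()

# Capture everything between "stats_ptsd = " and the last semicolon
data <- str_match_all(r,'stats_ptsd = (.*);')

# Column 2 of the match matrix holds the captured JSON text
data <- data.frame(jsonlite::fromJSON(data[[1]][,2])$data$players)
write.csv(data,file="players.csv")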

You could also subset and re-order before writing out:

df <- setNames(data[,c("X2","X1")],c("Name","Id"))
write.csv(df,file="players.csv")
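
If you want to check the extraction step without hitting the network, here is a minimal sketch that runs the same idea on an inline string. The sample text, the two player rows, and the assumption that each row starts with the ID followed by the name are made up for illustration; the real file has more columns and may differ in layout. It uses base R's sub() in place of str_match_all(), which is a swap of mine and not part of the answer above.

```r
library(jsonlite)

# Hypothetical sample that mimics the assumed shape of stats_ptsd.js;
# the player rows are invented for illustration only.
js_text <- 'var stats_ptsd = {"data":{"players":[[101,"Doe, John"],[102,"Roe, Jane"]]}};'

# Keep only the text between "stats_ptsd = " and the trailing semicolon.
json_text <- sub("^.*stats_ptsd = (.*);[[:space:]]*$", "\\1", js_text)

# Parse the JSON; equal-length inner arrays are simplified to a matrix.
players <- fromJSON(json_text)$data$players

# Select columns by position (name first, then ID) and label them.
df <- setNames(
  as.data.frame(players, stringsAsFactors = FALSE)[, c(2, 1)],
  c("Name", "Id")
)
print(df)
```

Selecting columns by position rather than by the auto-generated names keeps the sketch independent of how the data frame happens to label them.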

References:

  1. https://github.com/yusuzech/r-web-scraping-cheat-sheet/blob/master/README.md#rvest6.1