[TriLUG] R-language problem - maybe code

M.R. via TriLUG trilug at trilug.org
Thu Sep 26 12:55:50 EDT 2019


Using Windows 10 and RStudio, I am working with code that processes the
state's registered-voter file.
The machine has 16 GB of memory (15.5 GB usable), a 64-bit OS, and 12
cores at 2.70 GHz.

*v_a_05* is 5 645 868 records with 16 variables. There are 2 other files
of similar size in the global environment, plus numerous other files.

# Next, set up another data frame with just 16 records and populate full.name with the word "squiggle".
v_a_06 = v_a_05[20:35,]
v_a_06$full.name = "squiggle"
# the two above worked.

I ran the above as a test because the next piece of code, which is my
actual problem, has failed repeatedly: it never returns, locks up RStudio,
and requires a forced shutdown:

# using full active dataframe for creating the real, desired full.name variable
v_a_06$full.name = v_a_05[1:3, paste(last_name, first_name, middle_name, sep = " ")]
# does not work. 
#Therefore, I tried a subset:
v_a_06$full.name = v_a_05[20:35, paste(v_a_05$last_name, v_a_05$first_name, v_a_05$middle_name, sep = " ")] 
# it too fails

All three name columns exist in the data frame under those names and
display as one would expect.
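
For comparison, here is a minimal sketch of building such a column with a
vectorized paste() call. It assumes a plain data.frame; the toy data and
the names `toy` and `sub` are made up for illustration and are not taken
from the voter file.

```r
# Toy stand-in for v_a_05 (hypothetical data, three rows only).
toy <- data.frame(
  last_name   = c("Smith", "Jones", "Lee"),
  first_name  = c("Ann", "Bob", "Cy"),
  middle_name = c("Q", "", "R"),
  stringsAsFactors = FALSE
)

# paste() is vectorized: it returns one string per row, which is assigned
# directly as a new column instead of being used inside [ , ].
toy$full.name <- paste(toy$last_name, toy$first_name, toy$middle_name, sep = " ")

# Same idea on a subset: take the rows first, then paste within the subset
# so the lengths match.
sub <- toy[2:3, ]
sub$full.name <- paste(sub$last_name, sub$first_name, sub$middle_name, sep = " ")

print(toy$full.name)
print(sub$full.name)
```

In a base data.frame, the second argument of `df[i, j]` selects columns,
so a paste() result placed there is treated as a set of column names to
look up rather than as new values. With the full-length vectors in the
second attempt that means millions of column names, which may be related
to the lock-up, though that is a guess.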

Hypotheses:

 1. I am pushing the machine beyond its memory and cpu limits
 2. My code is syntactically wrong. However, Show Diagnostics only reports
    that a number of symbols are not in scope, including the three name
    variables.

    names(v_a_05)
     [1] "rand"          "county_id"     "county_desc"   "voter_reg_num" "status"
     [6] "last_name"     "first_name"    "middle_name"   "res_city_desc" "party"
    [11] "race"          "gender"        "yob"           "nc.sen.dist"   "nc.rep.dist"
    [16] "vtd_abbrv"

 3. other possibilities ?
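
One way to probe hypothesis 1 is to look at how much memory the objects in
the global environment occupy. The sketch below uses only base R
(object.size(), ls(), gc()); the data frame `toy` is a hypothetical
stand-in, and the numbers it prints say nothing about the real voter data.

```r
# Hypothetical stand-in object; replace with the real data frame name.
toy <- data.frame(id = 1:1000,
                  name = rep("squiggle", 1000),
                  stringsAsFactors = FALSE)

# Approximate size of a single object, in bytes and in a readable unit.
toy_bytes <- as.numeric(object.size(toy))
print(format(object.size(toy), units = "MB"))

# Approximate size in bytes of every object in the global environment,
# largest first. vapply() returns an empty numeric vector if nothing is there.
sizes <- vapply(
  ls(envir = globalenv()),
  function(nm) as.numeric(object.size(get(nm, envir = globalenv()))),
  numeric(1)
)
print(head(sort(sizes, decreasing = TRUE)))

# gc() triggers a garbage collection and returns a matrix summarising
# the memory R currently has in use.
mem <- gc()
print(mem)
```

If the three large data frames together come close to the 15.5 GB of
usable memory, removing the ones not needed with rm() and calling gc()
before retrying would be a reasonable next experiment.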



