[TriLUG] data set wrangling ---------- Re: TriLUG Digest, Vol 4047, Issue 1
Michael Rulison via TriLUG
trilug at trilug.org
Sat Dec 17 19:30:55 EST 2022
On 12/16/2022 12:00, Joe Purvis, via TriLUG wrote:
> ... My other thought was to see if there was a no-code/low-code solution like Airtable that might be able to put some nice forms in front of 7,000+ records worth of info. Part of the dataset is in a MySQL database, but then we'd have to find something to put in front of it to provide gentle data management/searching...
>
> Any ideas, comments, thoughts, shouts of horror, expressions of sympathy, or suggestions for therapy would be appreciated!
Fools walk in.... I have been working, sporadically, with a data set
of scores of thousands of records and a score of variables (columns),
using the R language. If I can learn to put together some code to draw
a random sample, or to partition a set into ranks based on a metric such
as real estate value, hard-core LUGgers should be able to get over those
initial hurdles pretty fast. R just eats up my data; what slows my work
is getting the syntax right down to the last character, not the speed of
my processor (4-year-old laptop).
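To give a flavor of those two tasks, here is a minimal sketch in base R.
The data frame, its column names (id, value), and the sizes are made up
for illustration; they are not from my actual data set.

```r
# Hypothetical data: 1,000 parcels with a made-up value column.
set.seed(42)                      # make the random draws repeatable
parcels <- data.frame(
  id    = 1:1000,
  value = round(runif(1000, min = 50000, max = 900000))
)

# Random sample of 100 rows, drawn without replacement.
picked <- parcels[sample(nrow(parcels), 100), ]

# Partition every row into four ranks (quartiles) by value.
breaks <- quantile(parcels$value, probs = seq(0, 1, by = 0.25))
parcels$rank <- cut(parcels$value, breaks = breaks,
                    labels = c("Q1", "Q2", "Q3", "Q4"),
                    include.lowest = TRUE)

table(parcels$rank)               # count of rows in each rank
```

Changing the probs argument (say, seq(0, 1, by = 0.1) with ten labels)
would give deciles instead of quartiles.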
More: R will happily import spreadsheets and CSVs, and even parse text.
And once one has a hunk of code that does what one wants, one can reuse
it with updated data. It is easy to use regular expressions for complex
searches. Make a monster flat file, then rip it apart for various needs.
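A rough sketch of the import-then-search idea, again in base R. The file
contents, column names, and patterns here are invented so the example
runs on its own; a real spreadsheet file would need an add-on package
(readxl is one such), which this sketch does not use.

```r
# Write a tiny hypothetical CSV so the example is self-contained.
csv_path <- tempfile(fileext = ".csv")
writeLines(c(
  "id,owner,address",
  "1,Smith,12 Oak St",
  "2,Jones,400 Main Street",
  "3,Smythe,7 Elm Ave"
), csv_path)

# Import the CSV into a data frame, keeping text columns as text.
records <- read.csv(csv_path, stringsAsFactors = FALSE)

# Regular expression: addresses ending in " St" or " Street".
on_street <- records[grepl(" St(reet)?$", records$address), ]

# Regular expression: owners spelled Smith or Smythe.
smiths <- records[grepl("^Sm[iy]the?$", records$owner), ]
```

When the data is updated, only csv_path has to point at the new file;
the rest of the code is reused as-is.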
Yes, it is code, but with lots of modules (packages) to handle
particular issues. And one can use R to produce subsets that then make
sense to handle with a spreadsheet.
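For instance, carving a spreadsheet-sized piece out of a flat file might
look something like this in base R. The counties, values, threshold, and
output file name are all hypothetical.

```r
# Hypothetical flat file: one row per record.
flat <- data.frame(
  id     = 1:6,
  county = c("Wake", "Durham", "Wake", "Orange", "Durham", "Wake"),
  value  = c(250000, 180000, 410000, 320000, 95000, 150000)
)

# Keep only Wake rows worth at least 200,000, and only two columns.
wake_high <- subset(flat, county == "Wake" & value >= 200000,
                    select = c(id, value))

# Write the subset as CSV so a spreadsheet program can open it.
out_path <- file.path(tempdir(), "wake_high.csv")
write.csv(wake_high, out_path, row.names = FALSE)
```

The same pattern, with a different condition, produces each of the
smaller files one wants to hand off.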
--
====================
Michael Rulison
☎ 919 205 9168