SQL for managing wordlists

A long time ago I started a notebook where I would record passages from books. Most of them are now in the quotes database. I also started collecting interesting words. The first word whose definition I recorded was fatalism: belief that everything is controlled by fat. The first serious words were bellicose: warlike and ameliorate: to make or desire to be better, improve. There are now 1896 interesting words in the database, along with examples of usage.

As part of an app that we were making we assigned a difficulty level to words. There are over 16,000 words that have a difficulty level assigned to them. The code to assign levels to is straightforward since the tables are in the same database.

UPDATE favorite_words, words_by_level 
SET favorite_words.level = words_by_level.level
WHERE favorite_words.word = words_by_level.word

Even though only one table is being updated, you need to mention both in the UPDATE section of the code.

I have another project where I am collecting words and quotes from Nero Wolfe stories. In that case I want to make sure that every word in that table has a definition in the favorite words table. I’m running the code in the Wolfe database and the favorite word table is in a different database so I need to make sure that I prefix the tables in the other database with the database name.

SELECT NWquote.word FROM NWquote 
LEFT JOIN words.favorite_words 
ON words.favorite_words.word = NWquote.word 
WHERE words.favorite_words.word IS NULL

I looked for words in my Wolfe database that did not have a level in my favorite words database and manually updated the level in the favorite words database.

LEFT JOIN words.favorite_words
ON words.favorite_words.word = NWquote.word
WHERE words.favorite_words.level = 0
ORDER BY NWquote.word

Surprisingly, around two hundred of my favorite words still did not have a difficulty level assigned so after manually checking them I assigned the rest to level 6.

UPDATE words.favorite_words, NWdatabase.NWquote
SET words.favorite_words.level = 6 
WHERE words.favorite_words.word = NWdatabase.NWquote.word 
AND words.favorite_words.level = 0

Fixing a MySQL table

I accidentally created a bunch of tables without setting the `id` field as autoincrement, having a primary key, and making the `datetime` field unique. Doing some of these things as separate steps doesn’t work because they are dependent on each other. So I combined them all into one command.

ALTER TABLE `values_Temperature` ADD PRIMARY KEY (`id`), MODIFY `id` INTEGER NOT NULL AUTO_INCREMENT, ADD UNIQUE KEY `datetime` (`datetime`);

Import CSV data into MySQL with PHPMyadmin

I have been getting data for temperature and humidity from the California Irrigation and Management System (CIMIS), importing it into LibreOffice, manipulating the date field, and copying it into BBEdit to convert to sql files for import. That works fine for a years worth of data, but I wanted to get the last five years and not have to do any manual manipulation of the data.

The data from CIMIS comes in as a CSV file with the qc fields mostly blank. This is a problem when the last field is empty since the import function thinks that there is no data there. So I ignore the last field when creating the table. Also the field names have spaces and it is generally not common to use spaces in MySQL field names, so I replaced them with underscores. The qc fields all have the same name so I numbered them.

I want my data to have an ID column but I don’t want to have to add it to the CSV file. The import works fine if it is at the end of the table. You can always move it to the beginning with PHPMyAdmin later.

Stn Id,Stn Name,CIMIS Region,Date,Hour (PST),Jul,ETo (in),qc,Precip (in),qc,Sol Rad (Ly/day),qc,Vap Pres (mBars),qc,Air Temp (F),qc,Rel Hum (%),qc,Dew Point (F),qc,Wind Speed (mph),qc,Wind Dir (0-360),qc,Soil Temp (F),qc
125,Arvin-Edison,San Joaquin Valley,9/30/2014,0100,273,0.00, ,0.00, ,0, ,13.3, ,59.6, ,76, ,52.1, ,3.2, ,156, ,72.1,

  `Stn_Id` int(4),
  `Stn_Name` varchar(50),
  `CIMIS_Region` varchar(50),
  `Date` varchar(10),
  `Hour_(PST)` varchar(4),
  `Jul` float,
  `ETo_(in)` float,
  `qc1` int(4),
  `Precip_(in)` float,
  `qc2` int(4),
  `Sol_Rad_(Ly/day)` float,
  `qc3` int(4),
  `Vap_Pres_(mBars)` float,
  `qc4` int(4),
  `Air_Temp_(F)` float,
  `qc5` int(4),
  `Rel_Hum_(%)` float,
  `qc6` int(4),
  `Dew_Point_(F)` float,
  `qc7` int(4),
  `Wind_Speed_(mph)` float,
  `qc8` int(4),
  `Wind_Dir_(0-360)` float,
  `qc9` int(4),
  `Soil_Temp_(F)` float,

When importing, I needed to compress the data since it exceeds the PHPMyAdmin 2MB limit. I also didn’t want to fool with the file at all so I when importing I skipped the first line using this selection in PHPMyAdmin.

Skip this number of queries (for SQL) starting from the first one: 1

I also used the default CSV import settings. After import I got this error because the last line in the CSV file is empty.
Invalid column count in CSV input on line 45434.

Otherwise the import went fine.

Now I need to get the time and date info into a datetime field. CIMIS is a bit weird in that they treat midnight as 2400 of the current day whereas MySQL treats it as 00:00:00 of the next day. So I just converted all of the 2400 to 2359. Then I add a minute to convert them to MySQL format. Since I asked for data through today, I got all of today’s times—even though there is no data for it. CIMIS puts an ‘M’ in the field which is imported as ‘0’. If air temp, soil temp, and relative humidity are all 0, that means there is no data for that time period, so I deleted those rows.


UPDATE `CIMIS_All` SET `Hour_(PST)` = '2359' WHERE `Hour_(PST)` = '2400';
UPDATE `CIMIS_All` SET `datetime` = STR_TO_DATE(CONCAT(`Date`,`Hour_(PST)`), '%m/%d/%Y%H%i');
UPDATE `CIMIS_All` SET `datetime` = ADDTIME(`datetime`, '0:01:0') WHERE MINUTE(`datetime`) = '59';
DELETE FROM `CIMIS_All` WHERE `Air_Temp_(F)` = 0 && `Soil_Temp_(F)` = 0 && `Rel_Hum_(%)` = 0

Now that I have the data, I just export the fields I need into the tables that are already set up for the locations. I didn’t create a datetime field originally, so I’ll have to create it in the new table. Here’s an example where I import the temperature and humidity. You can import as many fields as you use.

INSERT INTO `CIMIS_Location` (`datetime`,`Air_Temp_(F)`,`Rel_Hum_(%)`) SELECT `datetime`,`Air_Temp_(F)`,`Rel_Hum_(%)` FROM `CIMIS_All`;

You can also automate this process by retrieving an XML file from the CIMIS server. You can retrieve it with this command.

wget -q -O CIMIS_data.xml  "https://et.water.ca.gov/api/data?appKey=YourKey&targets=70,71&startDate=2019-01-01&endDate=2019-01-05"

You can list as many targets as you want. The API key can be gotten from the CIMIS website. Once you have the data, you can parse it using XML parsing tools.

MySQL Datetime

I get temperature data from sensors that report midnight as 00:00:00. I also get data from the State of California CIMIS system and they report midnight as 24:00:00. If I want to merge my data with the state data it doesn’t work. Plus datetime in MySQL isn’t happy about time being 24:00:00.

So to get around that I put the data into my tables with the time portion of the datetime field as 23:59:59. I use BBEdit to edit the raw CSV data so that it can be used in a MySql INSERT statement and it’s just one more find and replace to make it work.

For most purposes that is fine, but there are some calculations that I do on an hourly basis and the last reading of the day is yesterday in the CIMIS data compared to my sensors so I get some slightly different answers. Also, if I am missing data for my sensors, it is difficult to add the missing data.

The solution is obvious, just add a second to the CIMIS data. I couldn’t figure out a way to do that in BBEdit or in LibreOffice.

But after the CIMIS data is in the table you can convert dates with 23:59:59 by adding one second to them with a simple MySQL command.

UPDATE `values_Temperature_1` SET `datetime_temp` = ADDTIME(`datetime`, '0:0:01') WHERE SECOND(`datetime`) = '59'