Page 1 of 1

Recovering from bad data import

Posted: Wed Nov 01, 2017 5:23 pm
by SteveWx
Due to a severe storm, I was knocked offline for a couple of days. In any event, I did a test import which looked terrific, so I imported the records from WeatherDisplay and everything did not follow the pattern of the first record? Here is the test import:
Summary
The following would be imported to database:

Date/time: 2017-10-29 23:30
Temperature: 62.4 F
Dewpoint: 62.06 F
Apparent Temperature: 62.6 F
Humidity: 99 %
Pressure: 29.37 inhg
Wind speed: 7.0 mph
Wind gust: 23.0 mph
Precipitation: 0.27 in
Rain rate: 0.10 in/h
Wind direction: 90 degrees
Solar radiation: No solar sensor - ignored.

But now I have crazy record pressure of 99inHg, temperature readings in some wind fields (not all), 20 inches of rain in October, and everything is a complete mess... It is as though the field numbers did not retain their same position throughout the import. Anyone have any advise or suggestions other than abandoning ship? Thanks.

Re: Recovering from bad data import

Posted: Wed Nov 01, 2017 7:35 pm
by dmgould
Hopefully you made a backup of the database before you did an import. I ran into this sort of problem at the beginning trying to import data from WU. If you made a backup I'd carefully setup the import again and try a small sample.

Re: Recovering from bad data import

Posted: Wed Nov 01, 2017 11:31 pm
by Luc
Hi Steve,
Can you send me your import file (to ljm.heijst@gmail.com) together with the settings you applied to the import program?
May be we can sort things out.
If necessary you can delete the period with bad data from your MT database.
Cheers, Luc

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 1:12 pm
by SteveWx
Thanks Luc, but I think I have figured out where the process is failing. WeatherDisplay doesn't seem to use forwarding zeros, and the Meteotemplate space delimiter parsing script doesn't handle that well. In this example, the time (minute) is in field 5 at 10:09AM and in field 4 at 10:10AM. There are 20 fields at 10:09AM and 19 fields at 10:10AM, which causes importing chaos. Please note- the Meteotemplate forum formatting deletes the extra space in this example text:

30 10 2017 10 9 59.6 88 56.0 29.084 4.0 15.0 180 0.000 1.539 8.039 38.197 59.6
30 10 2017 10 10 59.6 88 56.0 29.084 4.0 15.0 180 0.000 1.539 8.039 38.197 59.6

Field Number Value (10:09AM)
0 30
1 10
2 2017
3 10
4 null
5 9
6 59.6

Field Number Value (10:10AM)
0 30
1 10
2 2017
3 10
4 10
5 59.6
6 null

I'm surprised if I am the first to encounter this- what do other WeatherDisplay users do to catch up missing data?

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 1:27 pm
by Jachym
From what you posted I dont see an issue, both rows have equal number of fields

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 1:27 pm
by Jachym
If I remember correctly WD delimiter is not space, but tab

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 1:44 pm
by SteveWx
Maybe this will illustrate it better- as I noted, the formatting in the forum truncated the extra space:

Image

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 2:23 pm
by Jachym
Yes, use "Tab" as separator, this will work because tab is a single or several subsequent spaces (solving this problem)

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 6:46 pm
by SteveWx
Using tab as the delimiter instead of a space worked much better! Thanks for the bulk database editor too Jachym, it made the cleanup less painful. Now I'll head on over to the feature request thread and ask for the option to have more than one line of data to be tested before a bulk import.

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 6:51 pm
by Jachym
Hi,
ok good.

The test import will only test first line, otherwise it would be unusable. If it was to test every line you would have thousands of lines to go through and check and that would first of all take time and second, the user wont do it, it would take them too much time too.

Re: Recovering from bad data import

Posted: Thu Nov 02, 2017 11:50 pm
by Luc
Hi Steve and Jachym,

Two remarks.

1. Your data import has an interval of 1 minute while MT default has an interval of 5 minutes.
So, after such an import you should run the database reformat script.

Code: Select all

http://www.mysite.com/install/reformat.php
BTW. It would be great when the import data program would test this and give a warning when the interval is not 5 minutes.

2. We could have some extra test build in the import programs that stops the execution when a line has a different number of fields than the previous line.

Cheers, Luc

Re: Recovering from bad data import

Posted: Fri Nov 03, 2017 12:13 am
by Jachym
Yes, or delete the "space" option altogether since "tab" looks for one or more spaces and will handle one space fine as well

Re: Recovering from bad data import

Posted: Fri Nov 03, 2017 12:47 pm
by SteveWx
Thanks again Luc, excellent suggestions, and I ran the reformat script this morning. I live in the woods and tend to get knocked off the grid a couple of times a year, and will certainly remember to use tab delimiting with WeatherDisplay from here on out. Although I keep thinking WeeWX on a Pi might be kind of fun.