Author Topic: Corrupt data feed (24 messages, Page 1 of 1)

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 10:11 AM          Msg. 1 of 24
Hi there,

I download historical data at 1-minute granularity on a regular basis. I do it once a day, at the end of the day, around 6 or 8 PM EST. What I found is that for stocks priced above a certain threshold, which I can only estimate (perhaps about $30.00), all data appears to be correct. The lower the stock price, the more likely it is that a corrupt, aberrant value will be thrown in. For instance, for C (Citigroup), which has been trading under $4.00 lately, such aberrant values appear almost daily.

There is some weird regularity in them. I will give a row of 15 dates. The bin where the wrong value appears is shown in parentheses (the timing is approximate because it is taken from a further 5-min granulation I do myself to show in a graph): 8/2/2010 (3:30); 8/3/2010 (3:30); 8/4/2010 (11:30); 8/5/2010 (no corruption); 8/6/2010 (3:30); 8/9/2010 (3:30); 8/10/2010 (3:30); 8/11/2010 (no corruption); 8/12/2010 (4:00); 8/13/2010 (3:30); 8/16/2010 (no corruption); 8/17/2010 (3:30); 8/18/2010 (no corruption); 8/19/2010 (no corruption); 8/20/2010 (12:00 & 3:30).

Here is a sample of actual data:

3.7399 3.74 3.73 3.735 1285472 2010-08-20 15:15:00.000
3.74 3.74 3.73 3.74 1623420 2010-08-20 15:16:00.000
3.7305 3.74 3.73 3.74 410455 2010-08-20 15:17:00.000
3.735 3.74 3.73 3.74 2070484 2010-08-20 15:18:00.000
3.74 3.74 3.73 3.7393 782217 2010-08-20 15:19:00.000
3.74 3.74 3.73 3.7305 250245 2010-08-20 15:20:00.000
33.02 33.025 32.99 32.992540927 2010-08-20 15:21:00.000
3.735 3.74 3.73 3.73 1652890 2010-08-20 15:22:00.000
3.73 3.74 3.73 3.73 444935 2010-08-20 15:23:00.000
3.7375 3.74 3.73 3.73 1920830 2010-08-20 15:24:00.000
3.73 3.74 3.73 3.7375 2893528 2010-08-20 15:25:00.000

I do all my programming myself from scratch. When I plot curves for my graphics, those aberrant values greatly distort them because I have to squeeze the maximum and minimum values into a rectangle and the program has no way of knowing that certain data is wrong. Thus the curve itself becomes diminutive and not helpful. Now I am trying to devise a routine that will throw them out and replace them with the average of two bins on each side of the wrong value.

It is a painful process because there might be some unpredictable distortions and I've already gotten a taste of some of them. In short IQFeed data IS CORRUPT as far as I can tell. You must do something about it.
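A minimal sketch of the kind of repair routine described above, written in Python rather than the C# used in this thread. The function name, the two-bin window, the 20% tolerance, and the rule of skipping bins with fewer than three neighbors are all illustrative assumptions. It only masks symptoms and does not address why a wrong value arrived in the first place.

```python
from statistics import median

def repair_outliers(closes, window=2, tolerance=0.20):
    """Return a copy of closes with isolated spikes replaced.

    A value counts as a spike when it differs from the median of its
    neighbors (up to `window` bins on each side) by more than `tolerance`,
    expressed as a fraction. Bins with fewer than three neighbors are left
    alone, because a single bad neighbor would dominate the comparison.
    """
    repaired = list(closes)
    for i, value in enumerate(closes):
        neighbors = closes[max(0, i - window):i] + closes[i + 1:i + 1 + window]
        if len(neighbors) < 3:
            continue
        reference = median(neighbors)
        if reference and abs(value - reference) / abs(reference) > tolerance:
            # Replace the spike with the average of the neighboring bins.
            repaired[i] = sum(neighbors) / len(neighbors)
    return repaired

# Fourth price column of six consecutive one-minute rows, with one spike.
closes = [19.8925, 19.9, 19.92, 540.23, 19.9, 19.9]
fixed = repair_outliers(closes)
```

With these inputs only the fourth value is replaced (by the average of its four neighbors); the other bins are returned unchanged.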

Thanks,

- Alex

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 10:45 AM          Msg. 2 of 24
This is a piece of daily data feed for GPS:

19.9 19.9 19.89 19.8925 1529 2010-02-17 14:08:00.000
19.89 19.9 19.89 19.9 1700 2010-02-17 14:09:00.000
19.9 19.92 19.89 19.92 19702 2010-02-17 14:10:00.000
540.3675 540.44 540.21 540.23 2564 2010-02-17 14:11:00.000
19.91 19.92 19.9 19.9 37151 2010-02-17 14:12:00.000
19.9 19.91 19.9 19.9 1475 2010-02-17 14:13:00.000

How did 540.36 come about?

AlexB

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 23, 2010 10:56 AM          Msg. 3 of 24
Alex. I assure you that the data corruption is not in the data itself. I have attached 1 min interval data for Citibank for 8/20 that I just pulled from the feed. You can see that there is no corruption in that data.

With that said, if the feed is delivering corrupt data, it must be getting corrupted somewhere in the transmission (either out of the servers, into/out of IQFeed, or into your app), or in your datastore.

Can you provide me with some details about how your app uses the feed? You mentioned you run downloads once a day. Do you issue a single request, wait for data, and then issue another request? Or do you issue multiple requests at the same time? Are the requests sent on different sockets to the feed or on the same socket? If you are sending multiple requests at the same time, how many? How many symbols are you downloading data for? How much data is requested for each symbol? What is the exact request that you are using for the requests (if each request is different, please tell me how you build the request)?

Also, can you give me some basic specs about the operating environment? CPU/RAM/OS as well as Internet speed and where your machine is physically located (State)?

Thanks.

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 23, 2010 11:00 AM          Msg. 4 of 24
Hmm...looks like it didn't add the attachment.



File Attached: C1MIN.txt (downloaded 1578 times)

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 23, 2010 11:01 AM          Msg. 5 of 24
Also, here is the download for GPS for that date. I included Minute data even though your post shows daily data because the data example you provided appears to be minute.



File Attached: GPS1MIN.txt (downloaded 1565 times)

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 11:36 AM          Msg. 6 of 24
Steve hi,

Thank you for the reply. I don't see any data you posted though. Answering your question though, this is what I do.

I have a list of about 43 stocks (symbols). My program starts with the first one (AAPL). First it checks the availability of quotes in my SQL Server by taking the last quote entered. Let's say for tonight it will be a quote for last Friday around 7 PM or so. This dateTime is taken as the start date for the IQFeed request. The end dateTime for the same request will be DateTime.Now, which will be roughly 7 PM EST tonight. I use the HIT command, shape it accordingly, and send it.

I can post the whole routine if you wish or send it to you via email. It is fairly large. As a matter of fact it is not one routine but about five including the one that saves the data in Sql Server.

What I found was that I could not make a smooth transition between packets; some minutes are lost in between. In order to prevent loss of data I have another routine that checks for missing bins and issues an extra request or requests to fill gaps, which are small. This additional routine slows the execution somewhat, naturally, but it assures complete validity of data in terms of missing bins.
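A gap check of the kind described here could look roughly like the following Python sketch (the thread's code is C#; the function name and the fixed one-minute step are assumptions). It lists every minute between the first and last stored bar that has no row. A missing minute is not necessarily an error, since a thinly traded symbol may simply have had no trades in that minute.

```python
from datetime import datetime, timedelta

def missing_minutes(timestamps, step=timedelta(minutes=1)):
    """Return the bar times absent between the first and last timestamp."""
    present = set(timestamps)
    if not present:
        return []
    current, last = min(present), max(present)
    gaps = []
    while current <= last:
        if current not in present:
            gaps.append(current)
        current += step
    return gaps

# Hypothetical stored bars with one minute missing.
stored = [
    datetime(2010, 8, 25, 15, 28),
    datetime(2010, 8, 25, 15, 29),
    datetime(2010, 8, 25, 15, 31),
]
gaps = missing_minutes(stored)
```

Each entry in `gaps` could then drive a narrower follow-up request for that symbol.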

Keep it in mind that my system works extremely well for stocks whose values are over $30.00 or perhaps even $25.00. It makes me feel that it is not my problem.

I did not mention it in my OP but there is more corrupt data which I ignore. Data that precedes normal trading hours is so gross that I loathe even to look at those values. I have a routine that deletes those once in a while. There are weird quotes for 3:20 AM and so forth.

I don't quite get your question about the server. I send all my requests to the only server I know of. I got this information from you. I can post complete details if you wish. The server for me is a proxy server as a matter of fact. It is your API. I use local host and the port number 9100.

This is my way:

using System;
using System.Net;
using System.Net.Sockets;

public static class CreateTcpSocket
{
    public static void createTcpSocket ( )
    {
        // ReceiveTimeout 1000 - default is 0; mClient.Blocking Yes; BlockSource - unsupported
        Socket mClient = new Socket ( AddressFamily.InterNetwork, SocketType.Stream, ProtocolType.Tcp );
        mClient.ReceiveTimeout = 1000; // milliseconds; default is 0 (no timeout)

        // The local IQFeed proxy: localhost, port 9100.
        System.Net.IPAddress ipAdd = System.Net.IPAddress.Parse ( "127.0.0.1" );
        int port = Int32.Parse ( "9100" );
        IPEndPoint endPoint = new IPEndPoint ( ipAdd, port );

        // Globals is defined elsewhere in the project; the socket is shared through it.
        Globals.mClient = mClient;
        mClient.Connect ( endPoint );
    } // createTcpSocket
}

Thanks, - Alex

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 11:38 AM          Msg. 7 of 24
OK, I apologize. I do see the quotes in the attachment.

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 11:41 AM          Msg. 8 of 24
Yep, it is weird. Your data appears to be all right. Then what is my problem?

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 11:49 AM          Msg. 9 of 24
OK, I definitely use only one socket. I send one request at a time. The request is per symbol. It spans the time frame I need. If I find any data is missing because of my inability, as I mentioned, to arrange for a smooth transition between packets your server sends, then after the data is collected into my internal DataTable I issue another request thru the same socket. That request is for the same symbol and for a narrower time frame (span). There might be a few such requests perhaps per symbol but I have no way of knowing. I can measure them though.

I see that I somewhat misunderstood your post at first. Now I am trying to clarify details.

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 12:00 PM          Msg. 10 of 24
OK, I am posting the initial routine that is the starting point for the whole chain. You will see that the names all around are Hist_Ameritrade, not IQFeed. It is because I designed the whole system originally for downloading TD-Ameritrade history data but then switched to IQFeed:

private void UpdateAllSymbolsHist_OHLC_Ameritrade ( )
{
    Console.WriteLine ( );
    Console.WriteLine ( "Hist_OHLC_Ameritrade Downloads: \r\n" );
    Console.WriteLine ( );
    string[ ] symbols = FindAllSymbolsInDB.FindAllTableSymbolsInDataBase ( "Hist_OHLC_Ameritrade" );
    foreach ( string symbol in symbols )
    {
        Globals.currentSymbol = symbol;
        Globals.currentDate = DateTime.Now;
        // The latest stored quote becomes the start of the requested range.
        DateTime latestDate = FindDates.FindEarliestOrLatestDate ( symbol, "Hist_OHLC_Ameritrade", "MAX" );
        CreateTableHistAmer ( Globals.currentSymbol );
        string cEndDate = DateTime.Now.ToString ( "yyyyMMdd HHmmss" );
        string cStartDate = latestDate.ToString ( "yyyyMMdd HHmmss" );
        Console.WriteLine ( symbol + " " + latestDate.ToString ( "yyyy-MM-dd HH:mm:ss" ) + " - " +
            DateTime.Now.ToString ( "yyyy-MM-dd HH:mm:ss" ) );
        // One HIT request per symbol, sent on the shared stream.
        string sHistRequest = "HIT," + Globals.currentSymbol + ",1," + cStartDate + "," + cEndDate + ",,,,1\r\n";
        byte[ ] sendBytes = Encoding.ASCII.GetBytes ( sHistRequest );
        stream.Write ( sendBytes, 0, sendBytes.Length );
        ReceiveData ( );
    }
} // UpdateAllSymbolsHist_OHLC_Ameritrade

I used your code I believe although I am not quite clear on this point. Most of your code must be in ReceiveData, I guess.
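For readers who want to see the exact bytes this routine sends, here is a small Python sketch that builds the same request string as the C# code above. The function name is made up for illustration. Only the fields present in the posted code are filled in, and the empty fields are reproduced as-is without asserting what they mean.

```python
from datetime import datetime

def build_hit_request(symbol, start, end, interval="1"):
    """Build a request string in the same shape as the posted C# code."""
    fmt = "%Y%m%d %H%M%S"  # same layout as the C# "yyyyMMdd HHmmss" format
    return "HIT,{},{},{},{},,,,1\r\n".format(
        symbol, interval, start.strftime(fmt), end.strftime(fmt))

request = build_hit_request(
    "GPS", datetime(2010, 8, 20, 19, 0, 0), datetime(2010, 8, 23, 19, 0, 0))
payload = request.encode("ascii")  # bytes that would be written to the socket
```

Logging `request` before each send gives a record of which symbol was requested and in what order.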

I want to stress again and again, that for any symbol with values over about $25.00 I never have any problem.

Thanks.


AlexB
Edited by alexB on Aug 23, 2010 at 12:01 PM

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 23, 2010 01:27 PM          Msg. 11 of 24
Thanks for the info Alex. I don't see anything necessarily wrong with the request generation code you have posted.

Can you send me a complete symbol list that you are using? Knowing that might help us track down where the data is getting corrupted.

For example, it looks like the GPS request you posted above has some data mixed in from GOOG. The data matches GOOG's 1min data for that timeframe exactly. Do you happen to keep a log of what requests you are sending to the feed? If so, it would be interesting to know if you sent a GOOG request immediately before or after the GPS request.
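The comparison described here can be automated. The following Python sketch (names are hypothetical, and the thread's own code is C#) reports the timestamps at which a symbol's stored bar is identical to the bar of the symbol requested next to it. The four prices are taken in the column order of the rows posted above; identical bars in two symbols can occasionally be legitimate, so a hit is a lead rather than proof.

```python
def find_foreign_bars(bars, neighbor_bars):
    """Return timestamps where a symbol's bar equals the neighbor symbol's bar.

    Both arguments map a timestamp string to a tuple of four prices.
    """
    return sorted(ts for ts, prices in bars.items()
                  if neighbor_bars.get(ts) == prices)

# GPS rows as stored, and the GOOG bar reported to match at 14:11.
gps = {
    "2010-02-17 14:10:00": (19.9, 19.92, 19.89, 19.92),
    "2010-02-17 14:11:00": (540.3675, 540.44, 540.21, 540.23),
    "2010-02-17 14:12:00": (19.91, 19.92, 19.9, 19.9),
}
goog = {"2010-02-17 14:11:00": (540.3675, 540.44, 540.21, 540.23)}
suspects = find_foreign_bars(gps, goog)
```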

If you would like to email the rest of the code to developer support or send it to me in a Private message here on the forums, I will take a scan through it to see if I can find anything you are doing that would be obviously wrong.

Basically what this comes down to is that you shouldn't be having the difficulties you are having. Unless I am misunderstanding what your app is doing (and I don't think I am), what you have described should be a fairly straightforward process and you shouldn't need to issue extra requests to fill in missing data or account for corruption. We aren't receiving reports from other 3rd party developers (or even from regular customers) about this either.

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 02:49 PM          Msg. 12 of 24
Thank you Steve,

It gives me hope. Your email address must be -edited to remove address- correct? I will send you my code.

This is a complete list of my symbols, the ones that are used to download historical data:

AAPL ADP AIG APC ATVI BA BAC BHP BIDU BP BRCM C COP CSCO DELL DLTR EBAY ERTS F GOOG GPS GS HPQ IBM INTC INTU JNJ KLAC MCD MSFT QQQQ RIMM RMBS RSH SBUX T UUP XLB XLE XLF XLI XLP XOM YHOO

This is the order they are used in requests. It is strictly alphabetical. You are right, GPS and GOOG are next to each other.

I will try to email you some of my routines that are connected with the one I posted. If I missed any, please let me know.

I appreciate your help. I hope looking at my mess with a fresh eye will uncover the bug.

Thanks.



AlexB
Edited by DTN_Steve_S on Aug 23, 2010 at 02:51 PM
Edited by alexB on Aug 23, 2010 at 03:10 PM

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 23, 2010 02:58 PM          Msg. 13 of 24
Alex, that is the correct address but I removed it from your post.

I looked at the historical data for BRCM (the symbol immediately prior to C in your list) and once again, the values for 8/20 @ 15:21 match exactly what you reported for C at that time.

It would seem to me that the data is not getting cleared out properly from one request to the next and is somehow getting merged together.
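One common way this kind of merging happens is when a receive loop treats each socket read as a whole number of lines, or reuses a buffer that still holds the tail of the previous response. The Python sketch below shows one defensive pattern, assuming line-oriented responses; it is not the fix applied in this thread, whose code was exchanged by email. The class name is invented, and the `!ENDMSG!` end-of-response marker is an assumption that should be checked against the feed's protocol documentation.

```python
class LineAssembler:
    """Reassemble complete lines from arbitrary TCP chunks for ONE request."""

    def __init__(self, terminator="!ENDMSG!"):
        self.terminator = terminator  # assumed end-of-response marker
        self._tail = b""              # partial line carried to the next chunk
        self.done = False

    def feed(self, chunk):
        """Add raw bytes and return the complete data lines they finish."""
        data = self._tail + chunk
        # Everything before the last newline is complete; the rest waits.
        *complete, self._tail = data.split(b"\n")
        lines = []
        for raw in complete:
            text = raw.decode("ascii").strip()
            if not text:
                continue
            if text.startswith(self.terminator):
                self.done = True
                break
            lines.append(text)
        return lines

# A line split across two reads is only emitted once it is whole.
assembler = LineAssembler()
first = assembler.feed(b"19.9,19.91\r\n19.8")
second = assembler.feed(b"9,19.9\r\n!ENDMSG!,\r\n")
```

Creating a new `LineAssembler` for every request, and not sending the next request until `done` is set, keeps leftovers from one symbol out of the next symbol's rows.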

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 03:12 PM          Msg. 14 of 24
It is an amazing discovery. How could it be? I cannot picture the process that leads to it. Very strange.

Thanks.

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 23, 2010 03:19 PM          Msg. 15 of 24
Steve, I don't know if it has any relevance to the subject in question, but I also watch the stocks in Quote Tracker, which has very nice graphics. I have 30-inch monitors and have all those 40+ stocks displayed. Currently I use IQFeed, but the thing I am about to describe is not unique to IQFeed. I observed it also with the TD-Ameritrade feed. Once in a while the values of some stocks jump unreasonably high or, less frequently, drop. The changed value stays for a minute or two and then comes back to baseline. It reminds me of what I see in my tables. QT obviously is not connected to my application.


AlexB
Edited by alexB on Aug 23, 2010 at 03:20 PM

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 23, 2010 04:53 PM          Msg. 16 of 24
Alex, I haven't yet received an email from you. Perhaps it got caught by our email filters. If you did already send it, please make sure to zip any attachments, rename the file extension on the zip file to .ChangeToZip, and resend.

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 25, 2010 07:39 AM          Msg. 17 of 24
Steve hi,

Yesterday I did what you asked me to. I zipped one of my major files in the project (Form1), changed the extension to .ChangeToZip and emailed it to you. Have you gotten it?

Thanks.

AlexB

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 25, 2010 10:45 AM          Msg. 18 of 24
Alex, I have received the email. I am looking into this.

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 25, 2010 11:30 AM          Msg. 19 of 24
Steve hi,

In the previous emails I sent to you I also gave you a briefing as to how to view it. It is my working file which obviously contains a bug somewhere. Also you should keep in mind that this application performs other tasks and it clouds the things you will have to look into. I did not want to edit because I thought it could castrate the code you really need. Some parts may still be outside Form1. If you found something missing, please let me know, I will send those parts as well.

I also want to say that I appreciate your help and I hope a fresh eye of an expert may uncover the problem that's been bugging me.

Many thanks, - Alex

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 26, 2010 07:57 AM          Msg. 20 of 24
Steve hi,

I downloaded symbols yesterday after making the changes you suggested. I hope I made them all in the right place. Still, I believe the problem persists. I found an errant value in F (Ford) at 15:30, a typical time when such values occur. The stock preceding it is ERTS and when I looked into it I found for the first time that that bin was missing.

These are some F quotes:

11.309 11.32 11.3 11.315 60803 2010-08-25 15:27:00.000
11.31 11.32 11.31 11.32 51536 2010-08-25 15:28:00.000
11.3125 11.32 11.31 11.31 63769 2010-08-25 15:29:00.000
15.18 15.18 15.17 15.18 29753 2010-08-25 15:30:00.000
11.315 11.315 11.3 11.3093 66205 2010-08-25 15:31:00.000
11.3 11.34 11.3 11.33 248474 2010-08-25 15:32:00.000
11.34 11.36 11.33 11.36 257487 2010-08-25 15:33:00.000
11.355 11.36 11.34 11.345 204872 2010-08-25 15:34:00.000

These are ERTS quotes.

15.21 15.22 15.2 15.2 4900 2010-08-25 15:19:00.000
15.2 15.2075 15.19 15.2 4300 2010-08-25 15:20:00.000
23.1 23.105 23.09 23.09 29145 2010-08-25 15:21:00.000
15.19 15.19 15.16 15.18 26927 2010-08-25 15:22:00.000
15.18 15.18 15.16 15.17 15089 2010-08-25 15:23:00.000
15.17 15.17 15.17 15.17 1300 2010-08-25 15:24:00.000
15.17 15.17 15.17 15.17 1000 2010-08-25 15:25:00.000
15.1671 15.18 15.16 15.18 10564 2010-08-25 15:26:00.000
15.18 15.18 15.17 15.1735 1236 2010-08-25 15:27:00.000
15.18 15.18 15.17 15.17 3400 2010-08-25 15:28:00.000
15.18 15.18 15.17 15.18 2000 2010-08-25 15:29:00.000
15.17 15.19 15.17 15.19 48299 2010-08-25 15:31:00.000
15.19 15.23 15.19 15.23 36063 2010-08-25 15:32:00.000
15.23 15.24 15.23 15.24 9738 2010-08-25 15:33:00.000
15.24 15.25 15.24 15.24 37274 2010-08-25 15:34:00.000

You can see that ERTS has one line missing as well as an errant quote present at 15:21.

Would you take another look into my code? I am sending to you an email with the modified Form1 class. Also tell me if you need other classes.

Thanks.

AlexB

DTN_Steve_S
-DTN Guru-
Posts: 2093
Joined: Nov 21, 2005


Posted: Aug 26, 2010 08:31 AM          Msg. 21 of 24
Alex, I looked at the email and it looks like you didn't fix the first issue I found that was causing data to not be parsed correctly (you only fixed the second issue).

Hopefully addressing that issue will fix your issue entirely. However, if it doesn't, I am not sure how much more help I can be on the issue. You should be able to insert a Debug statement in your processing code to verify that the data is being received and parsed correctly from the feed. At that point, I can't offer any help as to why it would be getting corrupted in your data store.
Edited by DTN_Steve_S on Aug 26, 2010 at 08:31 AM

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 26, 2010 10:15 AM          Msg. 22 of 24
Steve hi,

Thank you very much. Inattention is my curse. I fixed the first issue and will try it again tonight. I appreciate your help very much.

AlexB

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 27, 2010 07:38 AM          Msg. 23 of 24
Steve hi,

I downloaded a new set of minute quotes last night and checked a few stocks today, especially the ones that were corrupted before.

It appears you've cured the problem. Still, I will need a few more days of observation to confirm, but as of now I am very encouraged. Also, this weekend I will try to review the code myself and analyze it with extensive printouts of lines at the juncture of two packets while they are coming from the socket.

Again, thank you very much. This is what a fresh look is capable of doing.

P.S. Tonight (Friday August 27, 2010) is the second night of clean download. Everything is working perfectly well.



AlexB
Edited by alexB on Aug 27, 2010 at 04:42 PM

alexB
-Interested User-
Posts: 75
Joined: Dec 19, 2009


Posted: Aug 28, 2010 11:38 AM          Msg. 24 of 24
Steve hi,

I just finished an extensive test. I deleted some of my old historical records going back to Jan 1, 2010 which were definitely corrupt. I saw the corruption on my graphics. I then re-downloaded the data and lo and behold the corruption disappeared completely.

I consider the problem solved. Again, I want to say that I am very grateful to you for such quick and sharp spotting of those two bugs.

Many thanks, - Alex

AlexB
 

 
