In my opinion, these types of queries should be done in the database. I use Linq for small datasets that are already in memory. Databases where designed specifically for querying data efficiently .
VB.NET - Linq is too slow to handle a large amount of data stored in list(of T)

Hello,
I have a list(of T) of more than 20,000 data, I want to group them by slices of hours, minutes, etc. To do a CandelStick Charting.
In this slice, we must collect the max, the min, the start value of the slice and the end value of the slice.
It takes me 1.7 sec to process 20,000 data. There is 1 data every second.
In the example below, I group them by minutes.
Here is the class containing the data type.
Public Class CPrice
Sub New()
End Sub
Private _Date As DateTime = Now
Public Property [Date] As DateTime
Get
Return _Date
End Get
Set(ByVal value As DateTime)
_Date = value
End Set
End Property
Private _Prix As Decimal =0
Public Property Prix As Decimal
Get
Return _Prix
End Get
Set(ByVal value As Decimal)
_Prix = value
End Set
End Property
End Class
The routine for regrouping. (here by 60 sec.), does it in 1.7 sec.
Function CandelCreate(ByVal Pr As List(Of CPrice)) As DataPointCollection
Dim IntervalSec As Long = 60
Dim DateMin As DateTime = Pr(0).Date
Dim DateCpt As DateTime = DateMin
Dim RetSerie As New Series("Ret")
RetSerie.YValuesPerPoint = 4
Dim DateDepart As DateTime = New Date(Year(DateMin), Month(DateMin), DateAndTime.Day(DateMin), Hour(DateMin), Minute(DateMin), 0)
DateCpt = DateDepart
Do
Dim Res0 = Pr.Where(Function(d) d.Date >= DateCpt And d.Date < DateCpt.AddSeconds(IntervalSec)).Select(Function(x) x.Prix)
Dim MaxInInterVal As Decimal = Res0.Max
Dim MinInInterVal As Decimal = Res0.Min
Dim OpenInInterval As Decimal = Res0.First
Dim CloseInInterval As Decimal = Res0.Last
RetSerie.Points.AddXY(DateCpt, MaxInInterVal, MinInInterVal, OpenInInterval, CloseInInterval)
DateCpt = DateCpt.AddSeconds(IntervalSec)
Loop While DateCpt > DateDepart And DateCpt <= Pr(Pr.Count - 1).Date
Return RetSerie.Points
End Function
Is there a way to speed up the processing?
Thanks for your help.
Steeve
2 answers
Sort by: Most helpful
-
-
Bruce (SqlWork.com) 36,181 Reputation points
2022-10-28T15:13:44.627+00:00 You are iterating thru the 20k list for every interval. I’d try one pass thru the list and using a keyed hash calc the intervals. Or linq group by interval. I tried a sample group by and it ran 200K rows in 32 milliseconds
(C#, don't know vb anymore):
public List<DataPoint> CandelCreate(List<CPrice> list, int interval) => list.GroupBy( r => Math.Floor((r.Date - DateTime.MinValue).TotalSeconds / (double) interval), r => r.Pric, (k, p) => new DataPoint { IntervalStart = DateTime.MinValue.AddSeconds(k * interval), Max = p.Max(), Min = p.Min(), First = p.First(), Last = p.Last() }).ToList();