Reading every file on its own thread

Jar1 21 Reputation points
2022-04-19T09:35:04.593+00:00

Hi
I have a program that reads a bunch of files, does some parsing on each string, and writes the result to a new file.
This is quite slow, so I thought of making it multithreaded, with every file processed on its own thread.
I tried doing this with the Thread class and the Task class, but the whole process actually got slower instead of faster.

Here are some code snippets showing how I did it (with Task, C#):

// Start file processing
tasks.Add(Task.Run(() => handleScriptFile(file, destFilePath, getVars)));

// Wait for all processing to finish
foreach (Task t in tasks)
{
    Task.WaitAny(t);
}

What am I doing wrong?

thanks!

Developer technologies C#

Accepted answer
  1. Bruce (SqlWork.com) 77,686 Reputation points Volunteer Moderator
    2022-04-19T14:46:44.973+00:00

    Parallel.ForEach uses the thread pool, and you can limit the number of threads it uses. If you are processing only dozens of files it will not make a difference, but you really don't want to create hundreds or thousands of threads, or more threads than there are actual CPU cores.

    List<T> is not thread-safe. You need to wrap read/write access in locks.
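
As a rough illustration of both points above, here is a minimal sketch that caps the degree of parallelism with ParallelOptions.MaxDegreeOfParallelism and guards a shared List<T> with a lock. The class name, ParseFile, and ProcessAll are hypothetical stand-ins (ParseFile takes the place of the poster's handleScriptFile), so treat it as one possible shape rather than the actual program.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

public static class ParallelFilesSketch
{
    // Hypothetical stand-in for the real per-file parsing work.
    public static string ParseFile(string file) => file.ToUpperInvariant();

    public static List<string> ProcessAll(IEnumerable<string> files)
    {
        var results = new List<string>();
        var gate = new object(); // guards every access to 'results'

        var options = new ParallelOptions
        {
            // Cap the number of concurrent workers at the logical core count.
            MaxDegreeOfParallelism = Environment.ProcessorCount
        };

        Parallel.ForEach(files, options, file =>
        {
            // The CPU-bound work runs outside the lock so workers do not serialize.
            string parsed = ParseFile(file);

            lock (gate)
            {
                // List<T>.Add is not thread-safe, so it only runs under the lock.
                results.Add(parsed);
            }
        });

        return results;
    }
}
```

Keeping the lock around only the Add call means the workers contend for a very short time. The order of the results is not guaranteed to match the input order.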

    1 person found this answer helpful.

4 additional answers

  1. Castorix31 90,521 Reputation points
    2022-04-19T10:00:13.187+00:00

    See the answers in this thread: Multi-threading - Reading multiple files
    (and a random sample from Google for Parallel: Using threads to process multiple image files)


  2. Jar1 21 Reputation points
    2022-04-19T10:26:49.42+00:00

    The bottleneck is actually in the parsing code. I disabled both the read and the write (disk) in the thread, but that didn't improve the speed.

    Not sure what to try next.


  3. Jar1 21 Reputation points
    2022-04-19T12:33:24.663+00:00

    I found the problem: fileProgress.Value (edit) was what
    caused the slowness when multithreading.
    :)
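
The post does not say what fileProgress is. Assuming it is a progress-bar-like control that every worker was updating directly, one common way to avoid that contention is to count completed files with Interlocked.Increment and pass the count through IProgress<int>. The class name, method name, and the work delegate below are hypothetical; this is only a sketch under that assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

public static class ProgressSketch
{
    // Processes files in parallel and returns how many were completed.
    // 'work' is a hypothetical stand-in for the per-file parsing.
    public static int ProcessWithProgress(
        IReadOnlyList<string> files, Action<string> work, IProgress<int> progress)
    {
        int done = 0;

        Parallel.ForEach(files, file =>
        {
            work(file);

            // Thread-safe counter; the workers never touch a shared control.
            int current = Interlocked.Increment(ref done);

            // Report is skipped when no progress sink was supplied.
            progress?.Report(current);
        });

        return done;
    }
}
```

In a WinForms or WPF app, a Progress<int> created on the UI thread, for example new Progress<int>(n => fileProgress.Value = n), runs its callback on that thread, so the workers only post a number instead of touching the control. That usage assumes fileProgress.Value is an int property.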


  4. Jar1 21 Reputation points
    2022-04-19T13:02:52.767+00:00

    Still a couple of questions:

    1) Should I use Parallel.ForEach instead of Thread/Task? Are there performance differences?

    2) Is using Add with List<T> thread-safe?

