@johananmahendran Here are the answers to your queries:
How is the prompt fed into the GPT model?
It is sent as-is, exactly as you pass it in your request. You are responsible for ensuring the prompt fits within the model's token limit, or for using a model with a higher limit. Most models default to a 4k-token context window, but there are 8k and 16k variants you can experiment with as well.
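Since staying within the context window is on you, you can count tokens before sending. Here is a minimal sketch using the `tiktoken` library; the model name, limit, and reserved-token budget are just example values:

```python
import tiktoken

# Example values; substitute the model and context window you actually use.
MODEL = "gpt-3.5-turbo"
CONTEXT_LIMIT = 4096  # a 4k model; 8k/16k variants have larger windows

def fits_in_context(prompt: str, reserved_for_completion: int = 512) -> bool:
    """Check that the prompt leaves enough room for the completion."""
    enc = tiktoken.encoding_for_model(MODEL)
    prompt_tokens = len(enc.encode(prompt))
    return prompt_tokens + reserved_for_completion <= CONTEXT_LIMIT
```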
However, does it also stream text to the OpenAI server in a similar way? Or does it send everything all at once?
The model requires the complete prompt before it can begin the completion, so the prompt must be sent in full up front. Streaming only applies to the response coming back.
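To make the distinction concrete, here is a sketch using the OpenAI Python client (model name and message are illustrative): the whole prompt goes up in one request, and with `stream=True` only the completion arrives incrementally:

```python
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

# The full prompt is sent in this single request; nothing streams *to* the server.
stream = client.chat.completions.create(
    model="gpt-3.5-turbo",  # illustrative; any chat model works
    messages=[{"role": "user", "content": "Explain token limits briefly."}],
    stream=True,  # only the *response* is streamed
)

# The completion comes back chunk by chunk.
for chunk in stream:
    delta = chunk.choices[0].delta.content
    if delta:
        print(delta, end="", flush=True)
```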