One option is Azure Event Grid, which can detect when a new file is uploaded to Azure Data Lake Storage and trigger downstream processes like Azure Functions or Logic Apps.

The other piece is Azure Functions: as a serverless compute service, it lets you run event-driven code in response to a variety of events (in your case, processing each new document upload). The function can be triggered by Event Grid and process the uploaded PDF or Word document.
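To make the Event Grid → Function handoff concrete, here is a minimal sketch of parsing a `Microsoft.Storage.BlobCreated` event payload to find out which blob was uploaded. It uses only the standard library; inside a real Azure Function you'd get the same payload from the `azure.functions.EventGridEvent` binding, and the account/container names below are made up for illustration:

```python
import json

def blob_info_from_event(event_json: str) -> dict:
    """Extract the blob URL and file name from an Event Grid
    'Microsoft.Storage.BlobCreated' event payload."""
    event = json.loads(event_json)
    url = event["data"]["url"]      # full URL of the uploaded blob
    name = url.rsplit("/", 1)[-1]   # file name portion of the URL
    return {
        "url": url,
        "name": name,
        "extension": name.rsplit(".", 1)[-1].lower(),  # route PDFs vs. Word docs on this
    }

# Example event, trimmed to the fields used above (hypothetical account/container):
sample_event = json.dumps({
    "eventType": "Microsoft.Storage.BlobCreated",
    "data": {"url": "https://myaccount.blob.core.windows.net/incoming/report.pdf"},
})

print(blob_info_from_event(sample_event)["name"])  # report.pdf
```

The extension lets the function decide whether to hand the bytes to a PDF library or a Word library.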
For splitting and reading PDFs, libraries like PyPDF2 (Python) or PDFBox (Java) can be used; for Word documents, python-docx (Python) or Apache POI (Java) are good choices.
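As a sketch of the PDF path with PyPDF2: the generator below splits an in-memory PDF into one-page PDFs, and a small helper builds zero-padded blob names for the pages (the naming scheme is my own convention, not an Azure requirement). The PyPDF2 import is done lazily so the naming helper has no dependency:

```python
from io import BytesIO

def page_name(stem: str, index: int, total: int) -> str:
    """Blob name for one split page, zero-padded so names sort in page order."""
    width = len(str(total))
    return f"{stem}/page-{index + 1:0{width}d}.pdf"

def split_pdf_pages(pdf_bytes: bytes):
    """Yield (page_number, single_page_pdf_bytes) for each page of the input.
    Requires PyPDF2 (pip install PyPDF2)."""
    from PyPDF2 import PdfReader, PdfWriter  # lazy import; see note above
    reader = PdfReader(BytesIO(pdf_bytes))
    for i, page in enumerate(reader.pages):
        writer = PdfWriter()
        writer.add_page(page)       # copy exactly one page into a new document
        out = BytesIO()
        writer.write(out)
        yield i + 1, out.getvalue()

print(page_name("report", 0, 120))  # report/page-001.pdf
```

The Word path would look similar with python-docx, except Word files have no fixed page concept, so you'd split on sections or paragraphs instead.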
After processing, the split pages can be stored in Azure Blob Storage for further indexing or retrieval. Once the document is split, you'd want to convert the content into vectors; if you want to stay within Azure-specific services, Azure Cognitive Search is an option.
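For the indexing step, each split page would be shaped into a document matching your search index schema before upload (e.g. via `SearchClient.upload_documents` from the `azure-search-documents` package). A minimal sketch, where the field names (`id`, `content`, `contentVector`, etc.) are assumptions that must match whatever schema you actually define:

```python
def to_search_document(doc_id: str, page_number: int, text: str, vector: list) -> dict:
    """Shape one split page as a document for an Azure Cognitive Search index.
    Field names here are illustrative -- they must match your index schema."""
    return {
        "id": f"{doc_id}-{page_number}",   # key field: unique per page
        "sourceDocument": doc_id,          # lets you group pages back into a document
        "pageNumber": page_number,
        "content": text,                   # extracted page text for keyword search
        "contentVector": vector,           # embedding (e.g. from Azure OpenAI) for vector search
    }

doc = to_search_document("report", 3, "Quarterly results...", [0.12, -0.08, 0.33])
print(doc["id"])  # report-3
```

Keeping `sourceDocument` and `pageNumber` as separate fields makes it easy to return a page hit together with a link back to the original file in Blob Storage.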
I think ADF might be overkill for your use case unless you have additional data transformation and integration needs that aren’t mentioned.