processTextInChunks
This JavaScript function processes long text inputs by breaking them into smaller chunks, applying a custom processing function to each chunk, and optionally handling overlaps between chunks.
processTextInChunks(inputText, processFn, config)
- inputText (string): The long input text to be processed.
- processFn (function, optional): A function to process each chunk. Defaults to logging the chunk.
- config (object, optional): Configuration options.
  - chunkLength (number): Length of each chunk. Default: 1000.
  - overlapLength (number): Length of overlap between chunks. Default: 0.
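To make the two config options concrete, here is a minimal sketch of how chunkLength and overlapLength could translate into slices of the input. It is not the library's source: sketchChunkBoundaries is a hypothetical helper, and it assumes that chunks advance by chunkLength, that endIndex is exclusive, and that the overlap is the last overlapLength characters preceding the current chunk.

```javascript
// Hypothetical helper, not part of the library: lists the chunks that a
// chunkLength/overlapLength pair would produce under the assumptions above.
function sketchChunkBoundaries(text, chunkLength = 1000, overlapLength = 0) {
  if (!Number.isInteger(chunkLength) || chunkLength <= 0) {
    throw new RangeError("chunkLength must be a positive integer");
  }
  if (!Number.isInteger(overlapLength) || overlapLength < 0) {
    throw new RangeError("overlapLength must be a non-negative integer");
  }
  const chunks = [];
  for (let startIndex = 0; startIndex < text.length; startIndex += chunkLength) {
    // Assumed: endIndex is exclusive and clamped to the end of the text.
    const endIndex = Math.min(startIndex + chunkLength, text.length);
    chunks.push({
      mainChunk: text.slice(startIndex, endIndex),
      // Text immediately before this chunk; empty for the first chunk.
      overlapChunk: text.slice(Math.max(0, startIndex - overlapLength), startIndex),
      startIndex,
      endIndex,
    });
  }
  return chunks;
}

const demo = sketchChunkBoundaries("abcdefghij", 4, 2);
console.log(demo.map((c) => `${c.overlapChunk}|${c.mainChunk}`));
// → ["|abcd", "cd|efgh", "gh|ij"]
```

With overlapLength left at 0, every overlapChunk in this sketch is an empty string, so the chunks simply tile the input.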
const longText = "Your very long text here...";
const result = processTextInChunks(
  longText,
  (mainChunk, overlapChunk, info) => {
    // Process the chunk here
    console.log(`Processing chunk from ${info.startIndex} to ${info.endIndex}`);
    console.log(`Main chunk: ${mainChunk}`);
    console.log(`Overlap: ${overlapChunk}`);
    return mainChunk.length; // Example return value
  },
  {
    chunkLength: 500,
    overlapLength: 50
  }
);
console.log(result);
- If processFn is not provided, the function will log each chunk and return the original text.
- If processFn returns a value, these values will be collected in an array and returned.
- If processFn doesn't return anything, the original input text is returned.
- The processFn receives three arguments:
  - mainChunk: The current chunk of text being processed.
  - overlapChunk: The overlapping text from the previous chunk (empty for the first chunk).
  - An info object containing:
    - startIndex: Start index of the current chunk in the original text.
    - endIndex: End index of the current chunk in the original text.
    - isLastChunk: Boolean indicating if this is the last chunk.
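The return-value rules above can be sketched as follows. This is a hypothetical re-creation for illustration, not the library's actual code: processTextInChunksSketch is an invented name, and the sketch assumes that "doesn't return anything" means undefined, that endIndex is exclusive, and that the result is an array as soon as at least one call returns a value.

```javascript
// Hypothetical re-creation for illustration only; the real implementation may differ.
function processTextInChunksSketch(inputText, processFn, config = {}) {
  const { chunkLength = 1000, overlapLength = 0 } = config;
  if (!Number.isInteger(chunkLength) || chunkLength <= 0) {
    throw new RangeError("chunkLength must be a positive integer");
  }
  // Default behaviour described above: log each chunk and return nothing.
  const fn = processFn ?? ((mainChunk) => { console.log(mainChunk); });
  const results = [];
  for (let startIndex = 0; startIndex < inputText.length; startIndex += chunkLength) {
    const endIndex = Math.min(startIndex + chunkLength, inputText.length);
    const value = fn(
      inputText.slice(startIndex, endIndex),
      inputText.slice(Math.max(0, startIndex - overlapLength), startIndex),
      { startIndex, endIndex, isLastChunk: endIndex === inputText.length },
    );
    // Assumed: undefined is treated as "no return value" and is not collected.
    if (value !== undefined) results.push(value);
  }
  // Collected values win; otherwise fall back to the original text.
  return results.length > 0 ? results : inputText;
}

const lengths = processTextInChunksSketch("abcdefghij", (chunk) => chunk.length, { chunkLength: 4 });
console.log(lengths); // → [4, 4, 2]

const unchanged = processTextInChunksSketch("abcdefghij", () => {}, { chunkLength: 4 });
console.log(unchanged); // → abcdefghij
```

How the real function treats a processFn that returns a value for some chunks but not others is not stated here; the sketch simply skips the missing values.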
- Processing large texts in smaller, manageable pieces.
- Applying transformations or analysis to text while maintaining context through overlaps.
- Tokenization or parsing of large documents.
Migrated from folder: Libraries/processTextInChunks/processTextInChunks