Extract text from a file between two markers
- file
- extraction
A common approach is a small state machine: read the file line by line until the <START> marker is encountered, switch into a “recording mode”, and extract each line until the <END> marker is encountered. The process then repeats, so multiple sections in the same file are all extracted.
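def extract_sections(file):
    """Yield the lines between each <START> line and the next <END> line."""
    inRecordingMode = False
    for line in file:
        if not inRecordingMode:
            # Outside a section: skip lines until a start marker appears.
            if line.startswith('<START>'):
                inRecordingMode = True
        elif line.startswith('<END>'):
            # The end marker closes the section; the marker line itself is dropped.
            inRecordingMode = False
        else:
            yield line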
For simple cases, this could also be solved with a regular expression.
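As a rough sketch of the regular-expression variant, the following assumes the whole file fits in memory and that each marker sits at the start of its own line, as in the loop above. The name extract_sections_regex and the sample text are made up for illustration.

```python
import re

# Assumption: markers start a line; the rest of the <START> line is discarded.
# DOTALL lets '.' cross newlines, MULTILINE lets '^' match at each line start.
SECTION_RE = re.compile(r"^<START>[^\n]*\n(.*?)^<END>", re.DOTALL | re.MULTILINE)


def extract_sections_regex(text):
    """Return a list with the text of every section between the markers."""
    # The lazy '.*?' stops at the first <END> line, so sections are not merged.
    return SECTION_RE.findall(text)


# Hypothetical sample input with two sections.
sample = (
    "header\n"
    "<START>\n"
    "line 1\n"
    "line 2\n"
    "<END>\n"
    "noise\n"
    "<START>\n"
    "line 3\n"
    "<END>\n"
)

sections = extract_sections_regex(sample)
```

Unlike the line-by-line loop, this reads the full text at once and returns each section as one string, so it is best suited to small files; for large inputs the state machine is likely the safer choice.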