HOW TO convert HTML content to plain text - with Excel!
There may be times when you need to extract just the text from a blob of HTML copied from the page source, because the content couldn't be copied directly or the text on the web page was hidden. Recently, I wanted to get the subtitles of a YouTube video, but it wasn't easy to copy them from the transcript. I also couldn't locate the timedtext file that contains the subtitles, so I had to point at the Transcript block using Developer Tools (F12 keyboard shortcut) and get the HTML.
Here's the trick I tried -
Now that I had the text in HTML format, I pasted it into Excel, pressed Ctrl+H to open the Replace dialog box, typed <*> in the Find What textbox, left the Replace With textbox blank, and hit the Replace All button.
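That removed all the tags along with their attributes and left just the text. It works because Excel's Find and Replace treats * as a wildcard for any run of characters, so <*> matches everything from an opening angle bracket to a closing one.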
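If Excel isn't handy, the same idea can be sketched in a few lines of Python. This is not what I used above, just a rough equivalent: the regular expression <[^>]*> stands in for the <*> wildcard, and the function name strip_tags and the sample markup are made up for illustration. It is a naive approach that assumes simple markup, so it won't cope with script or style blocks or with a > inside an attribute value.

```python
import html
import re

# Rough regex equivalent of typing <*> in Excel's Find What box:
# a "<", then any characters that are not ">", then a ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(markup: str) -> str:
    """Remove HTML tags (hypothetical helper, naive by design)."""
    text = TAG_PATTERN.sub("", markup)   # drop tags and their attributes
    text = html.unescape(text)           # turn &amp; and friends into characters
    return " ".join(text.split())        # collapse leftover whitespace


# Made-up sample resembling a copied transcript block
sample = '<div class="segment"><span>Hello &amp; welcome</span> <b>to the video</b></div>'
print(strip_tags(sample))  # prints: Hello & welcome to the video
```

Unlike the Excel replace, this also decodes entities such as &amp; which would otherwise stay in the text as-is.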