
Tech Tips, Tricks & Trivia

by 'Anil' Radhakrishna
An architect's notes, experiments, discoveries and annotated bookmarks.


HOW TO convert HTML content to plain text - with Excel!

There may be times when you need to extract just the text from a blob of HTML copied from the page source, because the content couldn't be copied directly or the text on the web page was hidden. Recently, I wanted to get the subtitles of a YouTube video, but they weren't easy to copy from the transcript. I also couldn't locate the timedtext file that contains the subtitles, so I had to point at the Transcript block using Developer Tools (F12 keyboard shortcut) and get the HTML.

Here's the trick I tried -

Now that I had the text in HTML format, I copied it into Excel, pressed Ctrl+H to invoke the Replace dialog box, typed <*> in the Find What textbox, left the Replace With textbox blank and hit the Replace All button. That removed all the tags along with their attributes and left just the text.
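
If Excel isn't at hand, roughly the same find-and-replace can be sketched in Python. This is only an approximation of the <*> wildcard trick, not the method used above: the function name strip_tags and the sample markup are made up for illustration, and the regular expression is deliberately naive (it assumes no attribute value contains a ">" character and it does not drop the contents of script or style blocks).

```python
import re
from html import unescape

# Rough equivalent of Excel's <*> wildcard: a "<", any run of
# characters that are not ">", then the closing ">".
TAG_PATTERN = re.compile(r"<[^>]*>")


def strip_tags(html_text: str) -> str:
    """Remove anything that looks like an HTML tag, then decode entities.

    Hypothetical helper for illustration; assumes simple, well-behaved markup.
    """
    without_tags = TAG_PATTERN.sub("", html_text)
    # Turn entities such as &amp; back into plain characters.
    return unescape(without_tags).strip()


# Made-up sample resembling a copied transcript fragment.
sample = '<div class="cue"><span id="t1">Hello</span> <b>world</b> &amp; friends</div>'
print(strip_tags(sample))  # prints: Hello world & friends
```

Unlike the Excel replace, this sketch also decodes HTML entities, which may or may not be what you want for a given page.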

Also see -
HOW TO strip HTML tags and show just web page text programmatically and with EditPlus
