How to remove the header and footer of a web page programmatically
Apr 7th, 2008 23:26
ha mo, Colin Fraser, Rakesh Sharma
Try using a stripper program, but you may have to write it yourself.
First, identify the tags you want to get rid of, such as <head> and
</head>, work out how many characters lie between them, select the lot,
and then replace it with an alternate <head> and </head>.
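
As a rough illustration of that replace step, here is a minimal Python
sketch. The function name replace_head and the sample page are made up
for the example, and a regular expression is only a stand-in for a real
HTML parser, so treat it as a starting point rather than a robust tool.

```python
import re

# Match <head ...> through </head>, ignoring case and spanning newlines.
# The \b keeps the pattern from matching other tags such as <header>.
HEAD_RE = re.compile(r"<head\b[^>]*>.*?</head\s*>", re.IGNORECASE | re.DOTALL)

def replace_head(html, new_head="<head></head>"):
    """Return html with its first <head>...</head> block swapped for new_head."""
    # A function replacement avoids backslash-escape processing in new_head.
    return HEAD_RE.sub(lambda match: new_head, html, count=1)

# Hypothetical sample page, just to show the call.
page = "<html><head><title>Old</title></head><body><p>Hi</p></body></html>"
print(replace_head(page))
```

If the page has no <head> block at all, the text comes back unchanged.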
For the footer, things are a bit trickier. You first have to determine
whether the page has a footer at all, and everyone has their own idea of
what makes up a footer: a <span>, a <table>, a <p class="footer">, or
any one of a million or so other variations. Once you have identified
it, you do the same as you did above.
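
One way to guess at the footer is sketched below in Python, using the
standard html.parser module. The rule it applies (a <footer> tag, or any
element whose id or class contains the word "footer") is purely an
assumption made for this example, as are the names FooterStripper and
strip_footer. It also assumes the markup is reasonably well formed: an
unclosed tag inside the footer will throw the depth count off.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not change the depth.
VOID = {"area", "base", "br", "col", "embed", "hr", "img", "input",
        "link", "meta", "source", "track", "wbr"}

def looks_like_footer(tag, attrs):
    """Assumed heuristic: a <footer> tag, or 'footer' in the id or class."""
    if tag == "footer":
        return True
    return any(name in ("id", "class") and value and "footer" in value.lower()
               for name, value in attrs)

class FooterStripper(HTMLParser):
    """Re-emit the page, skipping any element that looks like a footer."""

    def __init__(self):
        super().__init__(convert_charrefs=False)
        self.out = []
        self.skip_depth = 0  # greater than 0 while inside a footer element

    def handle_starttag(self, tag, attrs):
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth += 1
        elif tag not in VOID and looks_like_footer(tag, attrs):
            self.skip_depth = 1
        else:
            self.out.append(self.get_starttag_text())

    def handle_startendtag(self, tag, attrs):
        # Self-closing tags such as <br/> never change the depth.
        if not self.skip_depth:
            self.out.append(self.get_starttag_text())

    def handle_endtag(self, tag):
        if self.skip_depth:
            if tag not in VOID:
                self.skip_depth -= 1
        else:
            self.out.append(f"</{tag}>")

    def handle_data(self, data):
        if not self.skip_depth:
            self.out.append(data)

    def handle_entityref(self, name):
        if not self.skip_depth:
            self.out.append(f"&{name};")

    def handle_charref(self, name):
        if not self.skip_depth:
            self.out.append(f"&#{name};")

    def handle_comment(self, data):
        if not self.skip_depth:
            self.out.append(f"<!--{data}-->")

    def handle_decl(self, decl):
        if not self.skip_depth:
            self.out.append(f"<!{decl}>")

def strip_footer(html):
    parser = FooterStripper()
    parser.feed(html)
    parser.close()
    return "".join(parser.out)

# Hypothetical sample page, just to show the call.
sample = ('<body><p>Main &amp; more</p>'
          '<div class="page-footer"><p>Copyright<br>2008</p></div></body>')
print(strip_footer(sample))
```

Because every site marks up its footer differently, expect to adjust
looks_like_footer for each site you work with.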
Of course, all this is complete nonsense, because you have to download
the page first and then open it in your stripper program window. So
there is no advantage for you in doing this.
Having said all that, you might try this:
Do a right click on the opened web page, and click "View Source". Find
the beginning of the content, select it to the end of the content, and
then copy it into your computer's clipboard. Using a new window in your
favourite text editor, paste the contents of the clipboard to that
window and save it as whatever you want.
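
If you would rather script that copy-and-paste step, a minimal Python
sketch could look like the following. The function name extract_between
and both marker strings are hypothetical: you would pick the markers by
looking at the source of the actual page, just as you would when
selecting by hand.

```python
def extract_between(html, start_marker, end_marker):
    """Return the text from start_marker up to and including end_marker."""
    start = html.find(start_marker)
    if start == -1:
        raise ValueError(f"start marker not found: {start_marker!r}")
    # Only look for the end marker after the start marker.
    end = html.find(end_marker, start + len(start_marker))
    if end == -1:
        raise ValueError(f"end marker not found: {end_marker!r}")
    return html[start:end + len(end_marker)]

# Hypothetical page source, standing in for what "View Source" shows.
page = ('<body><div id="nav">Menu</div>'
        '<div id="content"><p>The part you want.</p></div><!-- end content -->'
        '<div id="footer">Copyright</div></body>')

content = extract_between(page, '<div id="content">', '<!-- end content -->')
print(content)
```

The returned string can then be written to a file of your choice, for
example with open("content.html", "w", encoding="utf-8").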
This is simple, and if you are looking at a small set of pages, you may
be able to copy everything and save it as one page.
This method saves time and lots of aggravation, or so I found as a
student, and you get something you may want.