About me

Tuesday, April 12, 2011

Strip off Html tags from excel


When data is exported from any web page into excel or from any other sources into excel. It gets copied with html tags. This is making it difficult to read it. If there are few rows then we can remove them manually but if we there are more number of rows then we can go for below steps.

Follow these steps:

1.    Open up the VBE with Alt+F11.
2.    Locate your workbook in the top left Project Explorer window. Right-click on your workbook and choose Insert>Module.
3.    Paste the below given code in to the code module that will open.
4.    Go back to Excel, select the cells you want to clean and activate the macro with Tools>Macro>Macros. (Macro can also be activated by clicking Run icon in Visual Basic).
5.    All the html tags will be replaced.
6.    But sometimes some cells will be displayed as hashes. In such cases, select the Description column again and select Home > Clear > Clear Formats. This will get back your description.





Code :

Sub RemoveTags()
Dim r As Range
Selection.NumberFormat = "@"  'set cells to text numberformat
With CreateObject("vbscript.regexp")
  .Pattern = "\<.*?\>"
  .Global = True
  For Each r In Selection
    r.Value = Replace(.Replace(r.Value, ""), " ", " ")
    r.Value = Replace(.Replace(r.Value, ""), "<", "<")
    r.Value = Replace(.Replace(r.Value, ""), ">", ">")
  Next r
End With
End Sub

2 comments:

apgp88 said...

Awesome stuff !
Is there any way to avoid hashes ?

Anonymous said...

Thanks for adding step #6...I was struggling with this and have finally finished!

Post a Comment