Converting Text File To Html File With Python

July 31, 2024

I have a text file that contains: JavaScript 0 /AA 0 OpenAction 1 AcroForm 0 JBIG2Decode 0 RichMedia

Solution 1:

Just change your code to include <pre> and </pre> tags to ensure that your text stays formatted the way you have formatted it in your original text file.

# Open both files with a context manager so they are closed automatically.
with open("C:\\Users\\Suleiman JK\\Desktop\\Static_hash\\test", "r") as contents, \
        open("suleiman.html", "w") as e:
    for lines in contents:
        e.write("<pre>" + lines + "</pre> <br>\n")
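A single <pre> block around the whole file is usually enough, and any `<`, `>` or `&` in the text should be escaped so the browser does not treat it as markup. Here is a minimal sketch of that idea, assuming the standard-library html module; the function name text_to_pre and the sample string are made up for illustration:

```python
import html


def text_to_pre(text):
    """Return text wrapped in one <pre> block, with HTML special characters escaped."""
    # html.escape turns &, < and > (and quotes) into entities so they display literally.
    return "<pre>\n" + html.escape(text) + "</pre>\n"


# Hypothetical sample data in the same shape as the question's file.
sample = "JavaScript 0\n/AA 0\nOpenAction 1\n"
print(text_to_pre(sample))
```

To use it on a real file, read the input with open(...).read() and write the returned string to the .html file.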

Solution 2:

The output is HTML, so build it with BeautifulSoup:

from bs4 import BeautifulSoup

soup = BeautifulSoup("", "html.parser")  # name the parser explicitly
body = soup.new_tag('body')
soup.insert(0, body)
table = soup.new_tag('table')
body.insert(0, table)

with open('path/to/input/file.txt') as infile:
    for line in infile:
        row = soup.new_tag('tr')
        col1, col2 = line.split()
        for coltext in (col2, col1): # important that you reverse order
            col = soup.new_tag('td')
            col.string = coltext
            row.insert(0, col)
        table.insert(len(table.contents), row)

with open('path/to/output/file.html', 'w') as outfile:
    outfile.write(soup.prettify())
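Note that `col1, col2 = line.split()` raises ValueError on blank lines or on lines that do not have exactly two fields. As a hedged variation, the sketch below builds the same kind of tree with the standard library's xml.etree.ElementTree instead of BeautifulSoup (a swapped-in library, not what this answer uses) and skips such lines. The function name build_table and the sample lines are assumptions for illustration:

```python
import xml.etree.ElementTree as ET


def build_table(lines):
    """Build a <table> element with one two-cell row per well-formed input line."""
    table = ET.Element("table")
    for line in lines:
        fields = line.split()
        if len(fields) != 2:
            # Skip blank or malformed lines instead of raising ValueError.
            continue
        row = ET.SubElement(table, "tr")
        for text in fields:
            # SubElement appends in order, so no reversal is needed.
            ET.SubElement(row, "td").text = text
    return table


# Hypothetical sample lines in the same shape as the question's file.
table = build_table(["JavaScript 0\n", "\n", "OpenAction 1\n"])
print(ET.tostring(table, encoding="unicode", method="html"))
```

Because the cell text is set through the tree API, special characters such as `<` and `&` are escaped when the tree is serialized.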

Solution 3:

That is because browsers collapse runs of whitespace, including line breaks, when they render HTML. There are two common ways to keep the layout (and probably many more).

One would be to flag it as "preformatted text" by putting it in <pre>...</pre> tags.

The other would be a table (and this is what a table is made for):

<table>
  <tr><td>Javascript</td><td>0</td></tr>
  ...
</table>

Fairly tedious to type out by hand, but easy to generate from your script. Something like this should work:

with open("C:\\Users\\Suleiman JK\\Desktop\\Static_hash\\test", "r") as contents, \
        open("suleiman.html", "w") as e:
    e.write("<table>\n")
    for lines in contents:
        # The % operator needs a tuple, not the list that split() returns.
        e.write("<tr><td>%s</td><td>%s</td></tr>\n" % tuple(lines.split()))
    e.write("</table>\n")
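The table-generating loop can also be wrapped in a small function that escapes special characters and tolerates blank lines. The following is only a sketch under those assumptions; the name lines_to_table and the sample input (including a name that contains a space) are hypothetical:

```python
import html


def lines_to_table(lines):
    """Return an HTML table string with one name/value row per usable line."""
    rows = []
    for line in lines:
        # Split on the last run of whitespace so names containing spaces stay intact.
        parts = line.strip().rsplit(None, 1)
        if len(parts) != 2:
            continue  # ignore blank lines and lines with a single field
        name, value = parts
        rows.append("<tr><td>%s</td><td>%s</td></tr>"
                    % (html.escape(name), html.escape(value)))
    return "<table>\n" + "\n".join(rows) + "\n</table>\n"


print(lines_to_table(["JavaScript 0\n", "\n", "Rich Media 2\n"]))
```

Keeping the string building in a function that takes any iterable of lines means it can be fed an open file object directly, or a plain list when experimenting.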

Solution 4:

You can use a standalone template library like Mako or Jinja2. Here is an example with Jinja2:

from jinja2 import Template
c = '''<!doctype html><html><head><title>My Title</title></head><body><table><thead><tr><th>Col 1</th><th>Col 2</th></tr></thead><tbody>
       {% for col1, col2 in lines %}
       <tr><td>{{ col1 }}</td><td>{{ col2 }}</td></tr>
       {% endfor %}
   </tbody></table></body></html>'''

t = Template(c)

lines = []

with open('yourfile.txt', 'r') as f:
    for line in f:
        lines.append(line.split())

with open('results.html', 'w') as f:
    f.write(t.render(lines=lines))

If you can't install Jinja2, here is an alternative that needs no third-party packages:

header = '<!doctype html><html><head><title>My Title</title></head><body>'
body = '<table><thead><tr><th>Col 1</th><th>Col 2</th></tr></thead><tbody>'
footer = '</tbody></table></body></html>'

# File objects have no writeln() method, so write the newline explicitly.
with open('input.txt', 'r') as infile, open('output.html', 'w') as outfile:
    outfile.write(header + '\n')
    outfile.write(body + '\n')
    for line in infile:
        col1, col2 = line.rstrip().split()
        outfile.write('<tr><td>{}</td><td>{}</td></tr>\n'.format(col1, col2))
    outfile.write(footer)

Solution 5:

I added a title and loop over the file line by line, wrapping each line in <tr> and <td> tags, so the result is a single table with one column. There is no need for separate <td> cells for col1 and col2.

Log snippet:

MUTHU PAGE
2019/08/19 19:59:25 MUTHUKUMAR_TIME_DATE,line: 118 INFO | Logger object created for: MUTHUKUMAR_APP_USER_SIGNUP_LOG
2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 48 INFO | ***** User SIGNUP page start *****
2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 49 INFO | Enter first name: [Alphabet character only allowed, minimum 3 character to maximum 20 chracter]

HTML source of the output page:

'''

<?xml version="1.0" encoding="utf-8"?><body><table><p>
   MUTHU PAGE
  </p><tr><td>
    2019/08/19 19:59:25 MUTHUKUMAR_TIME_DATE,line: 118     INFO | Logger object created for: MUTHUKUMAR_APP_USER_SIGNUP_LOG
   </td></tr><tr><td>
    2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 48     INFO | ***** User SIGNUP page start *****
   </td></tr><tr><td>
    2019/08/19 19:59:25 MUTHUKUMAR_DB_USER_SIGN_UP,line: 49     INFO | Enter first name: [Alphabet character only allowed, minimum 3 character to maximum 20 chracter]

'''

Code:

from bs4 import BeautifulSoup

soup = BeautifulSoup(features='xml')
body = soup.new_tag('body')
soup.insert(0, body)
table = soup.new_tag('table')
body.insert(0, table)

# Every backslash in a Windows path must be doubled; a lone "\x" is an invalid escape.
with open('C:\\Users\\xxxxx\\Documents\\Latest_24_may_2019\\New_27_jun_2019\\DB\\log\\input.txt') as infile:
    title_s = soup.new_tag('p')
    title_s.string = " MUTHU PAGE "
    table.insert(0, title_s)
    for line in infile:
        row = soup.new_tag('tr')
        # Drop the trailing newline; a blank line yields a row with no cell.
        col1 = [each for each in line.split('\n') if each != '']
        for coltext in col1:
            col = soup.new_tag('td')
            col.string = coltext
            row.insert(0, col)
        table.insert(len(table.contents), row)

with open('C:\\Users\\xxxx\\Documents\\Latest_24_may_2019\\New_27_jun_2019\\DB\\log\\output.html', 'w') as outfile:
    outfile.write(soup.prettify())
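One caveat: a <p> placed directly inside <table> is not valid HTML, and browsers typically move it in front of the table. A hedged sketch of the same single-column idea, using only the standard library and putting the title in a <caption> element instead, might look like this. The function name log_to_table and the sample log lines are illustrative assumptions, not part of the original answer:

```python
import html


def log_to_table(lines, title):
    """Return a one-column HTML table with one row per non-empty log line."""
    parts = ["<table>", "<caption>%s</caption>" % html.escape(title)]
    for line in lines:
        text = line.rstrip("\n")
        if text:  # skip empty lines rather than emitting empty rows
            parts.append("<tr><td>%s</td></tr>" % html.escape(text))
    parts.append("</table>")
    return "\n".join(parts) + "\n"


# Hypothetical log lines; real input would come from iterating over the open file.
sample = ["INFO | first entry\n", "\n", "INFO | a < b\n"]
print(log_to_table(sample, "MUTHU PAGE"))
```

Passing the open input file as lines and writing the returned string to the output file reproduces the one-row-per-log-line layout without extra dependencies.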
