Description
When importing tables that contain non-ASCII characters (in my case, diacritics such as ç, á, é), the text comes out mangled.
I'm trying to import the table found at https://cvmweb.cvm.gov.br/SWB/Sistemas/SCW/CPublica/CiaAb/ResultBuscaParticCiaAb.aspx?CNPJNome=&TipoConsult=C
For example, the first row contains "2W ECOBANK S.A. - EM RECUPERAÇÃO JUDICIAL", which is turned into "2W ECOBANK S.A. - EM RECUPERA��O..."
Code snippet:
import DataFrames
import HTMLTables
import HTTP
const CVM_COMPANIES_URL = "https://cvmweb.cvm.gov.br/SWB/Sistemas/SCW/CPublica/CiaAb/ResultBuscaParticCiaAb.aspx?CNPJNome=&TipoConsult=C"
const CVM_COMPANIES_TABLE_ID = "dlCiasCdCVM"
const HEADERS = Dict(
    "Accept" => "text/html",
    "Accept-Encoding" => "gzip, deflate, br",
)
res = HTTP.get(CVM_COMPANIES_URL, headers=HEADERS)
df = HTMLTables.readtable(String(res.body), DataFrames.DataFrame, id=CVM_COMPANIES_TABLE_ID)
print(df)
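One possible cause, offered as an unverified guess: the two replacement characters standing in for "ÇÃ" would be consistent with the page being served in a single-byte encoding such as ISO-8859-1 while `String(res.body)` treats the bytes as UTF-8. The sketch below reproduces that effect on a hand-written byte vector, without any network access. `latin1_to_utf8` is a hypothetical helper, not part of HTTP.jl or HTMLTables.jl, and the encoding itself is an assumption that should be checked against the response's `Content-Type` header or the page's `<meta charset>`.

```julia
# Hypothetical helper (not from HTTP.jl or HTMLTables.jl).
# Assumes the input is ISO-8859-1: every Latin-1 byte value equals its
# Unicode code point, so Char(byte) yields the intended character.
latin1_to_utf8(bytes::AbstractVector{UInt8}) = join(Char(b) for b in bytes)

# "RECUPERAÇÃO" as ISO-8859-1 bytes: Ç is 0xC7 and Ã is 0xC3.
raw = UInt8[0x52, 0x45, 0x43, 0x55, 0x50, 0x45, 0x52, 0x41, 0xC7, 0xC3, 0x4F]

# String(::Vector{UInt8}) takes ownership of its argument, hence the copy.
mangled = String(copy(raw))    # bytes read as UTF-8; 0xC7 and 0xC3 are invalid here
decoded = latin1_to_utf8(raw)  # bytes read as ISO-8859-1

println(isvalid(mangled))  # false
println(decoded)           # RECUPERAÇÃO
```

If the assumption holds, passing `latin1_to_utf8(res.body)` in place of `String(res.body)` might work around the problem. Windows-1252 differs from ISO-8859-1 in the range 0x80 to 0x9F, so this byte-to-code-point mapping would not be exact for that encoding.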