Getting HTML tags from HTML gadget?

*(Posted 2004) [#1]
Does anyone know how to get the HTML tags etc. from an HTML view?

I know I posted this in the Blitz3d thread (unfortunately I use that most often ;) )


turtle1776(Posted 2004) [#2]
(bump)


soja(Posted 2004) [#3]
I don't think you can.

You can use WinAPI functions to request the page over HTTP yourself, but it's harder, and you have to manage your own timeouts, which I think needs multi-threading or callbacks. (I didn't get that far though, so I may be blowing smoke.)

I did something like this a long time back (except for the timeout stuff). I can post something if you really want when I get home.


*(Posted 2004) [#4]
That would be very helpful soja :)

I would personally like a B+ system like the one for the TextArea gadget, so you can 'retrieve' lines of HTML from the gadget.


soja(Posted 2004) [#5]
Here is what I used to get the HTML of a page. It actually queries the page itself (as the HTMLView gadget would do), but instead of displaying it, stores it in "page". Once received, you can then parse the page to get whatever you want (including HTML tags).

I wrote this quite a long time ago and haven't looked at it since. I know you could probably change the banks to other var/custom types, and there may be other problems. One issue I had, if I remember right, is that if the internet connection is faulty, it doesn't time out, or takes forever... I think.
;http://msdn.microsoft.com/library/default.asp?url=/library/en-us/wininet/wininet/internetconnect.asp

; The declarations below go in a .decls file in the Blitz userlibs folder.
; ServerPort, AcceptTypes and the context arguments are numeric/pointer values
; in the WinINet API, so they are declared as integers (pass 0 for "none").
.lib "wininet.dll"
InternetOpen%(Agent$, AccessType%, ProxyName%, ProxyBypass%, Flags%):"InternetOpenA"
InternetCloseHandle%(hInternet%):"InternetCloseHandle"
InternetConnect%(hInternet%, ServerName$, ServerPort%, Username$, Password%, Service%, Flags%, dwContext%):"InternetConnectA"
HttpOpenRequest%(hConnect%, Verb$, ObjectName$, Version%, Referer%, AcceptTypes%, Flags%, Context%):"HttpOpenRequestA"
InternetOpenURL%(hInternet%, URL$, Headers%, HeaderLen%, Flags%, Context%):"InternetOpenUrlA"
InternetReadFile%(hFile%, out_lpBuffer*, NumBytesToRead%, out_NumRead*):"InternetReadFile"

	Local hInternet%, hURL%, i%
	Local page$	; string; an untyped "page" would default to an integer
	Local out% = CreateBank(2048)	; receive buffer
	Local bytesRead% = CreateBank(4)	; receives the byte count of each read
	Local Url$ = "http://..."

	hInternet = InternetOpen("...", 0, 0, 0, 0)
	If hInternet Then
		; $84000000 = INTERNET_FLAG_RELOAD Or INTERNET_FLAG_NO_CACHE_WRITE
		hURL = InternetOpenURL(hInternet, Url, 0, 0, $84000000, 0)
		If hURL Then
			; Read chunks until InternetReadFile reports 0 bytes (end of page)
			Repeat
				InternetReadFile(hURL, out, BankSize(out), bytesRead)
				For i = 0 To PeekInt(bytesRead, 0) - 1: page = page + Chr$(PeekByte(out, i)) : Next
			Until PeekInt(bytesRead, 0) = 0
			InternetCloseHandle(hURL)
		Else
			Notify "Error opening page:"+Chr$(10)+Url, True
		EndIf
		InternetCloseHandle(hInternet)
	Else
		Notify "Error connecting to the Internet.  Aborting.", True
	EndIf
	FreeBank out : FreeBank bytesRead
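
Once the text is in "page", pulling out the tags is plain string searching. Here is a minimal sketch, assuming a tag is simply everything from a "<" to the next ">" (comments, scripts and a ">" inside an attribute value are not handled). ListTags and the sample string are made up for illustration and are not part of the code above.

```blitzbasic
;Hypothetical helper (not a BlitzPlus command): prints each <...> run in html$
;and returns how many were found.
Function ListTags%(html$)
	Local count% = 0
	Local tagEnd%
	Local tagStart% = Instr(html$, "<")
	While tagStart > 0
		tagEnd = Instr(html$, ">", tagStart + 1)
		If tagEnd = 0 Then Exit	;no closing ">", stop scanning
		Print Mid$(html$, tagStart, tagEnd - tagStart + 1)
		count = count + 1
		tagStart = Instr(html$, "<", tagEnd + 1)
	Wend
	Return count
End Function

;Made-up sample input; in practice pass the downloaded page string.
ListTags("<html><body><p>Hello</p></body></html>")
```

For anything beyond listing tags (attributes, nesting), a real parser would be needed; this only shows the Instr/Mid$ scanning idea.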



*(Posted 2004) [#6]
Thanks, will have a look :)


turtle1776(Posted 2004) [#7]
Thanks a lot Soja. I've been playing with a native Blitz version using Mark Sibly's HTTP Get code and your code, and yours is something like 30 times faster.

Here are two versions of the same function that download a URL to disk.

;This function downloads a web page to file. It is based on Soja's code (see above)

;These declarations go in a .decls file in the Blitz userlibs folder:
;.lib "wininet.dll"
;InternetOpen%(Agent$, AccessType%, ProxyName%, ProxyBypass%, Flags%):"InternetOpenA"
;InternetCloseHandle%(hInternet%):"InternetCloseHandle"
;InternetConnect%(hInternet%, ServerName$, ServerPort%, Username$, Password%, Service%, Flags%, dwContext%):"InternetConnectA"
;HttpOpenRequest%(hConnect%, Verb$, ObjectName$, Version%, Referer%, AcceptTypes%, Flags%, Context%):"HttpOpenRequestA"
;InternetOpenURL%(hInternet%, URL$, Headers%, HeaderLen%, Flags%, Context%):"InternetOpenUrlA"
;InternetReadFile%(hFile%, out_lpBuffer*, NumBytesToRead%, out_NumRead*):"InternetReadFile"

;Returns True on success, False on failure.
Function DownloadWebPage(url$,savedFile$)

	out% = CreateBank(2048)		;receive buffer
	bytesRead% = CreateBank(4)	;receives the byte count of each read

	;Download the web page
	hInternet = InternetOpen("...", 0, 0, 0, 0)
	If hInternet Then
		;$84000000 = INTERNET_FLAG_RELOAD Or INTERNET_FLAG_NO_CACHE_WRITE
		hURL = InternetOpenURL(hInternet, url$, 0, 0, $84000000, 0)
		If hURL
			;Read chunks until InternetReadFile reports 0 bytes
			Repeat
				InternetReadFile(hURL, out, BankSize(out), bytesRead)
				For i = 0 To PeekInt(bytesRead, 0) - 1
					page$ = page$ + Chr$(PeekByte(out, i))
				Next
			Until PeekInt(bytesRead, 0) = 0
			InternetCloseHandle(hURL)
			InternetCloseHandle(hInternet)
		Else
			Notify "Error opening page:"+Chr$(10)+url$, True
			InternetCloseHandle(hInternet)
			FreeBank out : FreeBank bytesRead
			Return False
		End If
	Else
		Notify "Error connecting to the Internet. Aborting.", True
		FreeBank out : FreeBank bytesRead
		Return False
	EndIf
	FreeBank out : FreeBank bytesRead

	;Write it to file (WriteLine appends a line ending after the page text)
	fileout = WriteFile(savedFile$)
	If fileout
		WriteLine fileout, page$
		CloseFile (fileout)
		Return True
	Else
		Return False
	End If
End Function



;This function downloads a web page to file. Unfortunately,
;it is a bit slow. It is based on Mark Sibly's HTTP Get.
Function DownloadWebPage2(url$,savedFile$)

	; Split into hostname and file to download
	If Left (url$, 7) = "http://" Then url$ = Right (url$, Len (url$) - 7)	
	slash = Instr (url$, "/")
	If slash
		webHost$ = Left (url$, slash-1)
		webFile$ = Right (url$, Len (url$) - slash + 1)
	Else
		webHost$ = url$
		webFile$ = "/"
	EndIf
	
	;Open the connection to the web page
	www=OpenTCPStream(webHost$,80 )
	If Not www Then Notify "Cannot connect to web server" : Return False
	WriteLine www, "GET " + webFile$ + " HTTP/1.1" ; GET / gets default page...
	WriteLine www, "Host: " + webHost$
	WriteLine www,"User-Agent: Directory Builder"
	WriteLine www,"Accept: */*"
	;Ask the server to close the connection after the response; otherwise an
	;HTTP/1.1 server may keep it open and Eof() can take a long time to trigger
	WriteLine www,"Connection: close"
	WriteLine www,""

	;Download it to file 1 line at a time (slow).
	;Note that the HTTP response headers are written to the file as well.
	fileout = WriteFile(savedFile$)
	If fileout
		While Not Eof(www)
		    WriteLine fileout, ReadLine (www)
		Wend
		CloseFile (fileout)
	Else
		CloseTCPStream www
		Return False
	End If

	;Close the TCP stream and exit.
	CloseTCPStream www
	Return True
End Function
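
For the "retrieve lines of HTML" idea raised earlier in the thread, one option is to read the saved file back a line at a time with ReadFile and ReadLine. This is only a sketch: PrintSavedLines and the file name "page.html" are hypothetical, and it assumes the page was already saved by one of the functions above.

```blitzbasic
;Hypothetical helper: prints the lines of a saved page, numbered from 1.
;Returns True if the file could be opened, False otherwise.
Function PrintSavedLines(savedFile$)
	Local lineNo% = 0
	Local filein% = ReadFile(savedFile$)
	If Not filein Then Return False
	While Not Eof(filein)
		lineNo = lineNo + 1
		Print Str(lineNo) + ": " + ReadLine(filein)
	Wend
	CloseFile filein
	Return True
End Function

;"page.html" is a made-up file name for illustration.
If Not PrintSavedLines("page.html") Then Print "Could not open the saved page."
```

Instead of printing, the lines could be stored in an array or a custom type if they need to be looked up by number later.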