Are you ready to switch to HTML parsing permanently?

83.3% 65
1.3% 1
15.4% 12

Total votes: 78. Date: 2010-04-14 00:49

Poll: A global switch to HTML parsing


Are you ready?

GHengeveld
#16 2010-04-16 18:33
I recently rediscovered the PHP function strip_tags(), which strips all HTML and PHP tags from a string. It has an allowable_tags option, a list of HTML tags that will not be stripped, so we can use it to allow only tags like <b>. Of course we'd have to put the input through htmlpurifier first to filter anything malicious. To allow posting blocks of unparsed code we'd have to run htmlentities over that part first (unless htmlpurifier does that for us). I suggest using the <pre> tag for this.
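A rough sketch of the whitelist idea above, assuming plain PHP without HTMLPurifier installed. The function names and the tag list are made up for illustration. Keep in mind that strip_tags() leaves attributes on the allowed tags untouched, which is one reason a purifier pass would still be needed.

```php
<?php
// Hypothetical helper: keep only a small whitelist of tags.
// strip_tags() removes every tag not listed, but it does NOT clean
// attributes on the tags it keeps, so this is not a full sanitizer.
function sketch_allow_basic_tags(string $input, string $allowed = '<b><i><pre>'): string
{
    return strip_tags($input, $allowed);
}

// Hypothetical helper: show a block of code literally inside <pre>
// by escaping everything that could be read as markup.
function sketch_code_block(string $code): string
{
    return '<pre>' . htmlspecialchars($code, ENT_QUOTES, 'UTF-8') . '</pre>';
}

echo sketch_allow_basic_tags('<b>bold</b> <script>alert(1)</script>'), "\n";
echo sketch_code_block('<a href="x">link</a>'), "\n";
```

The first call keeps the <b> pair and drops the <script> tags (but not the text between them); the second prints the link markup as visible text.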
Kilandor
#17 2010-04-16 20:32
Honestly nothing should change with BBcode at all.

BBcode IMO was created for 2 reasons.
  • To prevent and stop malicious injections of HTML. (Of course, this isn't always the case with malformed BBcode.)
  • For ease of the end user. Anyone can figure out BBcode in a couple of minutes; it is made to be user-friendly and easy to use (while HTML is not).

If you sat 2 average people down and showed them how to make a message with BBcode and then with HTML, most would go straight for the BBcode and understand it more easily, I think.

Example From a Game Site I'm running
  • [item]120000474[/item]
  • [item]>Bloody Flame Earrings[/item]
  • <a class="aiondb-item-full-small" href="http://www.aionarmory.com/item.aspx?id=120000474">Bloody Flame Earrings</a>
Depending on whether you use the ID or the name, it queries to get the opposite (name or ID). Even without that, if it just output the link directly: honestly, which would you rather do, or expect your users to use?
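For illustration, the ID form of the [item] tag could be expanded roughly like this. This is only a sketch: the $items array is a hypothetical stand-in for the database lookup, and sketch_parse_item_tags() is not a real Cotonti function.

```php
<?php
// Hypothetical lookup table standing in for the database query.
$items = [120000474 => 'Bloody Flame Earrings'];

// Expand [item]ID[/item] into the full link shown in the example above.
function sketch_parse_item_tags(string $text, array $items): string
{
    $result = preg_replace_callback(
        '/\[item\](\d+)\[\/item\]/',
        function (array $m) use ($items): string {
            $id = (int) $m[1];
            // Fall back to a generic label when the ID is unknown.
            $name = $items[$id] ?? ('Item ' . $id);
            return '<a class="aiondb-item-full-small" href="http://www.aionarmory.com/item.aspx?id='
                . $id . '">' . htmlspecialchars($name, ENT_QUOTES, 'UTF-8') . '</a>';
        },
        $text
    );
    return $result ?? $text;
}

echo sketch_parse_item_tags('[item]120000474[/item]', $items), "\n";
```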

Even personally, when working on forums and such, I prefer using BBcode over HTML.

Yes, there are editors out there that can handle it and do that for you, but a lot of the time they also make a mess of the HTML/BBcode. I think it would be a huge loss to remove BBcode.

What should be done is what most CMSs/forums do: have an option that turns on limited HTML parsing for people who want to use it, allowing some HTML tags to be entered directly.

The only option I could go for is to have multiple parsers. A global switch to HTML only would be horrific. I honestly don't think I could ever justify upgrading Cotonti if such a switch took place.
This post was edited by Kilandor (2010-04-16 22:30, 14 years ago)
urlkiller
#18 2010-04-16 22:16
I think a system where you could choose what you want would be best.

Users should have an easy way to insert HTML as well as BBcodes anywhere on the site...
Plus, I would really like to have more freedom when designing pages; this way you can do more the easy way...
URL shortener: http://bbm.li/!7AD5C7
Kort
#19 2010-04-16 22:16

BBCodes / ugly typography are killing Cotonti. Nobody needs a secure but ugly CMS.
All the above ideas & suggestions make very little sense because you're mostly talking about malicious code threats via contributors and contributions. This is not a typical situation. Most projects do not involve any contributions requiring rich-format capabilities from an untrusted contributor. It's mostly about comments and forums, and let's be honest: Cotonti forums are an absolutely redundant feature. So it is just comments, which is pretty simple to solve. So do not exaggerate and overestimate the problem.
SED.by - website creation, plugin and theme development for Cotonti
Kilandor
#20 2010-04-16 22:38
How is BBcode killing Cotonti?

BBcode serves its purpose perfectly fine and should be unchanged. My only point was, BBcode in no way should be removed.

The options should be something like this

  • BBcode(default)
  • BBcode with Filtered/Limited HTML
  • Filtered/Limited HTML
  • Pure HTML(could allow it, but not recommended)
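The four modes listed above could be wired up along these lines. This is just a sketch under assumptions: the mode names, the allowed-tag list and the one-tag BBcode parser are all invented here, and strip_tags() alone is not a complete HTML filter (it does not touch attributes).

```php
<?php
// Hypothetical whitelist for the "limited HTML" modes.
const SKETCH_ALLOWED_TAGS = '<b><i><u><p><br>';

// Toy BBcode parser: only [b] is handled to keep the sketch short.
function sketch_bbcode(string $text): string
{
    return preg_replace('/\[b\](.*?)\[\/b\]/s', '<b>$1</b>', $text) ?? $text;
}

// Pick the parsing strategy from a site-wide setting.
function sketch_render(string $text, string $mode = 'bbcode'): string
{
    switch ($mode) {
        case 'bbcode':       // BBcode only, all HTML is escaped
            return sketch_bbcode(htmlspecialchars($text, ENT_QUOTES, 'UTF-8'));
        case 'bbcode_html':  // BBcode plus a limited set of HTML tags
            return sketch_bbcode(strip_tags($text, SKETCH_ALLOWED_TAGS));
        case 'html_limited': // limited HTML, BBcode left as plain text
            return strip_tags($text, SKETCH_ALLOWED_TAGS);
        case 'html':         // pure HTML, nothing filtered (not recommended)
            return $text;
        default:
            throw new InvalidArgumentException("Unknown parser mode: $mode");
    }
}

$post = '[b]hi[/b] <i>x</i><script>y</script>';
foreach (['bbcode', 'bbcode_html', 'html_limited', 'html'] as $mode) {
    echo $mode, ': ', sketch_render($post, $mode), "\n";
}
```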

Then the owner can also choose an appropriate editor for their needs. I've also seen features that let users decide for themselves which editor they want to use.

This can even be extended the way BBcode already works now, which allows you to choose where it's parsed or not (pages, forums, etc.).

# Kort : All the above ideas & suggestions make very little sense because you're mostly talking about malicious code threats via contributors and contributions. This is not a typical situation. Most projects do not involve any contributions requiring rich-format capabilities from an untrusted contributor.

While this is true, it takes only a moment for someone to get angry, for disagreements or fights to occur, and for someone to suddenly decide to do something malicious. So even though in most cases that's not an issue, we can't just switch over to HTML and hope for the best.
This post was edited by Kilandor (2010-04-16 22:56, 14 years ago)
Kort
#21 2010-04-17 01:18
# Kilandor : How is BBcode killing Cotonti?
It's very simple: with BBCodes you'll have no typography. Meaning the website will not look attractive & professional. Returns instead of paragraph spacing (like in my example) bring your website back to the stone age.
# Kilandor : It takes only a moment for someone to get angry, for disagreements or fights to occur, and for someone to suddenly decide to do something malicious. So even though in most cases that's not an issue, we can't just switch over to HTML and hope for the best.
This is pure fiction and shall in no way be considered as a reason to keep BBCodes.
GHengeveld
#22 2010-04-17 06:28
Sorry Kilandor, I don't agree with you at all.

In my professional opinion, BBcode is a solution to a prehistoric problem, from back in the days when there were no good RTEs around. I simply cannot justify, in ANY way, making my customers learn BBcodes (no matter how easy it is) while professional solutions offer real-time WYSIWYG editing. The usability of RTEs is simply a leap ahead of BBcodes. And let's face it: end users do not care about security issues. They just want something that is easy to use and doesn't require learning anything new.

Going into detail on the security aspects of allowing HTML, there are very good solutions around to minimize the risk of a security issue. Purifying code and blacklisting / filtering certain tags is easy to implement and the performance hit is very acceptable, considering the added benefits.

I have no intention of removing BBcodes entirely, and I would advocate giving the most security-minded administrators the option to disable HTML in favor of BBcodes, but in the end I strongly believe HTML should become the default setting.
urlkiller
#23 2010-04-17 18:04
Agreed! But nevertheless I would let the admins choose...
donP
#24 2010-04-17 19:37
OK, we can let the admin choose the parsing method for his Cotonti website, but the choice must be made at installation time and be definitive.
I mean: if I choose HTML, I want my Cotonti database and core to be BBcode-free (meaning: no double table in the database to store the same content like now [the BBcoded text and the HTML-transformed one] and no double parser pass to check whether the content was submitted in BBcode or HTML format).
in [color=#729FCF][b]BLUES[/b][/color] I trust
Kilandor
#25 2010-04-17 20:24
As I said before. We can have HTML options, but there is no reason BBcode should be removed. It has many useful purposes.

As far as BBcode and typography go, it's only as limiting as the BBcode itself. If I choose to color something with [color=#FFFFFF]Hello[/color], it has the exact same function as <span style="color:#FFFFFF;">Hello</span>.
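The equivalence between a color BBcode and the span can be shown with a single regular expression. A minimal sketch, assuming the tag takes a six-digit hex color; sketch_color_bbcode() is a made-up name and not part of Cotonti.

```php
<?php
// Turn [color=#RRGGBB]text[/color] into the matching <span>.
// Everything else is escaped first, so only the BBcode produces markup.
function sketch_color_bbcode(string $text): string
{
    $safe = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    return preg_replace(
        '/\[color=(#[0-9A-Fa-f]{6})\](.*?)\[\/color\]/s',
        '<span style="color:$1;">$2</span>',
        $safe
    ) ?? $safe;
}

echo sketch_color_bbcode('[color=#FFFFFF]Hello[/color]'), "\n";
```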

I would never expect my users to remember large amounts of HTML when they can remember a simple BBcode line.

And screw security issues. It's not my reasoning for wanting to keep BBcode.

Trust and I talked and decided what will likely happen. We can and will provide both options, and are likely even going to provide 2 core editors in the future. The admin can choose what they want or don't want.

And on another note: please stop attacking my website, Kort, you're starting to make this personal. It in no way should be a part of this discussion.
Kort
#26 2010-04-17 20:31
# Kilandor : And on another note: please stop attacking my website, Kort, you're starting to make this personal. It in no way should be a part of this discussion.
Well, that was just an example of poor typography produced by a BBCode advocate, nothing more. Please accept my sincere apologies if I hurt you.
--
As for the rest, BBcode output is essentially a hard-to-control, unmarked flow of text with occasional spans & divs. And playing with line breaks to align blocks of text or insert headings (which you do often, unlike playing with colors) is really a tough job [to explain to a customer]. With BBcodes you're never able to control paragraph spacing (other than by using "return-return", which was abandoned long ago). There's a lot more, but I think that is enough to understand most BBcode shortcomings.
And one last thing (I apologize again if this hurts): gamer communities are what ldu/seditio were for. Cotonti is much more than that, I hope.
This post was edited by Kort (2010-04-17 20:45, 14 years ago)
Kilandor
#27 2010-04-17 20:36
In the example you provided, no BBcode is used at all. It's nothing more than plain text and <br />'s.
Kort
#28 2010-04-17 20:47
See above. Just in case: in the example, the spacing between the 2 paragraphs is almost twice as big as the spacing between the news item and the "date" & "posted by" blocks, which is neither nice nor logical. That's because you're using returns to control paragraph spacing, which is because you're using bbcodes, which is why using bbcodes is wrong!
Kilandor
#29 2010-04-17 21:20
Well, that is a different issue that could honestly be solved via a config option.

In the parsing used to output text, this is a parameter of the function, but its value is hard-coded. All text is sent through the function nl2br(), which inserts an HTML "<br>" tag at every line break.

So really that issue is not caused by BBcode itself.

You could edit your system/functions.php

function sed_parse($text, $parse_bbcodes = TRUE, $parse_smilies = TRUE, $parse_newlines = TRUE)
{
	global $cfg, $sys, $sed_smilies, $L, $usr;

To
function sed_parse($text, $parse_bbcodes = TRUE, $parse_smilies = TRUE, $parse_newlines = TRUE)
{
	global $cfg, $sys, $sed_smilies, $L, $usr;
	$parse_newlines = FALSE;

If you do this, text should behave as you'd expect if you just threw a bunch of text into a <div>: it would all be cramped together.
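To see the effect of that flag without touching core, here is a stripped-down stand-in. sketch_parse() is hypothetical and only mimics the newline handling; it is not the real sed_parse().

```php
<?php
// Hypothetical, reduced version of a parse function: escape the text,
// then optionally turn line breaks into <br /> tags via nl2br().
function sketch_parse(string $text, bool $parse_newlines = true): string
{
    $text = htmlspecialchars($text, ENT_QUOTES, 'UTF-8');
    if ($parse_newlines) {
        $text = nl2br($text);
    }
    return $text;
}

$post = "First paragraph.\n\nSecond paragraph.";

// With newline parsing: the blank line between paragraphs yields two breaks.
echo sketch_parse($post, true), "\n";

// Without it: the raw newlines stay and a browser collapses them.
echo sketch_parse($post, false), "\n";
```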

I understand what you are referring to. Visual logic tells a person to hit Enter twice so they see a paragraph, but doing so causes 2 <br> tags to be generated (clearly I do it myself). And I think any way of removing the duplicates would prevent people from making actual multiple breaks if they wanted to.

So what we also need to do is add an option to turn this off for people who do not want it.

Actually, upon further inspection, although it's minor, we should be using nl2br($text, true) so it outputs XHTML-compatible "<br />" instead of "<br>".
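A quick way to compare the two forms. Note that the second parameter of nl2br() only exists in PHP 5.3 and later, where it defaults to true, so on those versions the plain call already gives the XHTML form.

```php
<?php
$text = "line one\nline two";

// Default call: second parameter defaults to true on PHP 5.3+.
echo nl2br($text), "\n";

// Explicit XHTML-style break.
echo nl2br($text, true), "\n";

// HTML-style break without the trailing slash.
echo nl2br($text, false), "\n";
```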
Kort
#30 2010-04-17 21:50
Same with
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. [right]Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.[/right] Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
and
Lorem ipsum dolor sit amet, consectetur adipisicing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua.
[right]Ut enim ad minim veniam, quis nostrud exercitation ullamco laboris nisi ut aliquip ex ea commodo consequat.[/right]
Duis aute irure dolor in reprehenderit in voluptate velit esse cillum dolore eu fugiat nulla pariatur. Excepteur sint occaecat cupidatat non proident, sunt in culpa qui officia deserunt mollit anim id est laborum.
The first version works as needed but is hard to read; the second looks more logical but adds undesired spacing.
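The difference can be reproduced with a toy parser. A sketch only: it assumes a [right]...[/right] pair maps to a right-aligned div and that nl2br() runs afterwards; sketch_right_bbcode() is a made-up name.

```php
<?php
// Hypothetical [right] handling followed by newline-to-break conversion.
function sketch_right_bbcode(string $text): string
{
    $html = preg_replace(
        '/\[right\](.*?)\[\/right\]/s',
        '<div style="text-align:right;">$1</div>',
        $text
    ) ?? $text;
    return nl2br($html);
}

// Same content, written on one line and on three lines.
$inline    = 'Lorem ipsum. [right]Ut enim.[/right] Duis aute.';
$multiline = "Lorem ipsum.\n[right]Ut enim.[/right]\nDuis aute.";

// Count the extra breaks each variant picks up around the block.
echo substr_count(sketch_right_bbcode($inline), '<br />'), "\n";
echo substr_count(sketch_right_bbcode($multiline), '<br />'), "\n";
```

The readable three-line variant ends up with a break before and after the div, which is where the unwanted spacing comes from.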
IMHO, adapting BBcodes is not worth the effort, so we've switched to HTML parsing at seditio.by and in all new projects for the sake of simplicity and flexibility.
