Skip to content
This repository has been archived by the owner on Sep 18, 2019. It is now read-only.

Preserve invalid/ignored tokens for minifiers #2

Open
SimonSapin opened this issue Jun 13, 2012 · 4 comments
Open

Preserve invalid/ignored tokens for minifiers #2

SimonSapin opened this issue Jun 13, 2012 · 4 comments

Comments

@SimonSapin
Copy link
Member

Currently tinycss just drops tokens for invalid declarations or rules. This is fine for an UA like WeasyPrint, but not for a minifier: what is invalid now may become valid in the future. A minifier should pass it through and not drop it.

To enable this, tinycss should get a "preserve invalid" mode (off by defaut) where:

  • Any list of Declaration objects can also contain InvalidDeclaration objects which have a TokenList
  • Any list that can contain at-rules can also contain unparsed AtRule objects whose head and body attributes are TokenList’s

The precise API is still to be determined

@kennyluck
Copy link

Some of the (hypothetical) user scenario:

  1. A tinycss UA developer (meaning that he/she is unwilling to build a completely new parser from scratch) wants to support CSS variables/hierarchy before tinycss supports it.
  2. A tinycss UA developer wants to build a minimizer.
  3. A tinycss UA developer wants to build a CSS pretty printer.

For 2. a list of parsed tokens for a CSS file would be enough but that won't satisfy 3. For 1. it would indeed be nicer if tinycss provides something like InvalidDeclaration so that the tinycss UA developer won't need to override parse_declaration and potentially got lost in css21.py. He/she would still need to understand token_data.py I think, so yeah, the requirement list looks right.

This is all a bit hypothetical though, so don't prioritize it. I am a lot more interested in WeasyPrint than tinycss in fact :p

== the following is irrelevant to tinycss ==

I hope my intention of mentioning tinycss is clearer now. My goal is to reduce the number of people who treat the CSS 2.1 Core Grammar as some divine and unchangeable thing. Heck, some of those people keep mentioning the Core Grammar without really knowing a concrete situation that something will break. I don't support the $foo: bar idea, but I'd rather see opponents of the $foo: bar idea don't get categorized into unreasonable people just because CSS 2.1 Core Grammar is itself very unreasonable.

There are basically two ways to make this situation better:

  1. Obsolete the forwards-compatible layer idea.
  2. Make sure that the forwards-compatible layer isn't dropping anything, so that future implementation of this layer won't be cited as reasons why we can't change the grammar.

I thought css3-synatx was doing 2. so I'd rather see the description match a real implementation and that's basically tinycss with "preserve invalid" on, I think. But since Tab's intention seems to be 1., I am fine but I just hope that it's clearly stated that CSS 2.1 Core Grammar is obsoleted and no one will try to build an intermediate layer on top of that grammar. At least that implementation itself can't be claimed conforming and that implementation can't be cited as reasons why we shouldn't change the Core Grammar — equivalently, there should be nothing special about the CSS 2.1 Core Grammar. It's no more and no less stable than, say, the value of a property.

Since you asked me

how would you have implemented a (CSS 2.1 Core) parser without it (the CSS 2.1 Core Grammar)

, I think I would have implemented it using the extended CSS 2.1 Core Grammar that I wrote down, and that will just basically be the "preserve invalid" idea here. (except that I made no distinction of a Declaration to a InvalidDeclaration and that's not going to be usable in practice, so, yeah, it's again theoretical.)

I hope I am clear now.

@SimonSapin
Copy link
Member Author

I think that this feature is all that is needed in the parser for both 2 and 3. The difference between those is "just" whether to remove or add whitespace. Of course for 3. you’d need more code to keep track of the indentation level, but that’s not in the parser.

1 would require to either patch tinycss and send a pull request, or to extend the parser. Of course extending require a lot of context and knowledge of how things work but hopefully the documentation can help. I wouldn’t use InvalidDeclaration for this.

tinycss is designed to be somewhat forward-compatible through extending, not by keep around blobs of unparsed token that would be parsed "later". Of course you can do the latter, but I think it is not the "right thing" to do. Actually the docs already have an extension example with the star hack. This was included after I got a private email from someone making a minifier … by I don’t think this project was ever published.

@kennyluck
Copy link

I think that this feature is all that is needed in the parser for both 2 and 3. The difference between those is "just" whether to remove or add whitespace. Of course for 3. you’d need more code to keep track of the indentation level, but that’s not in the parser.

Yeah. My point is that if 2. is the only scenario, it probably would be better to either just support minimizing natively or provide a TokenList that represents the whole file.

Of course you can do the latter, but I think it is not the "right thing" to do.

OK. I don't think it's a bad idea to not to design around misuses of the system. Just pointing out some potential (mis)use cases here. :p

@SimonSapin
Copy link
Member Author

If you want a TokenList for the whole file, just use the tokenizer without the parser. This is fine to remove comments and collapse whitespace, but not for more advance minification since you don’t know eg. where declarations are.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants