Fix for keeping and using custom Cheerio options (Issue #273) #274
Torthu wants to merge 2 commits into
Conversation
Cheerio.prototype.options are now overwritten if Cheerio is instantiated with an options object. cheerioInstance.options is now passed to the parser when adding new DOM nodes to an existing Cheerio instance.
Nitpicks:
- include a space after `if`
- use single quotes
|
Nice patch! |
I think you meant, "should force lowercase for new nodes".
But really, the functionality you've implemented forwards all parsing options. We should write the tests to verify this a little more generally. Mind adding another assertion to each of these tests for the ignoreWhiteSpace option? Then we'd want to name these something like, 'respects parsing options defined when the context was created'.
You are right, and I agree.
|
This approach has potentially-surprising behavior: it modifies Cheerio globally. For instance, the following test will fail:

it('should maintain the parsing options of distinct contexts independently', function() {
  var str = '<g><someElem someAttribute="something">hello</someElem></g>';
  var $x = cheerio.load('', { xmlMode: false });
  var $h = cheerio.load('', { xmlMode: true });
  expect($x(str).html()).to.equal('<someelem someattribute="something">hello</someelem>');
});

I've implemented a fix for this and opened a pull request against your branch: Torthu#1

@davidchambers What do you think about this approach? It adds a bit of overhead, but I think it's important to keep contexts separate. |
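To make the failure mode in the comment above concrete, here is a minimal sketch that does not use Cheerio at all. The names `loadShared`, `loadIndependent`, and `tagName` are hypothetical stand-ins, and the "parser" only lowercases a tag name. It shows how options stored in one shared object leak from a later `load` call into an earlier context, while per-context copies stay independent.

```javascript
// Hypothetical model of the problem; none of these names are Cheerio APIs.
var sharedDefaults = { xmlMode: false };

// Shared-state version: every call overwrites the same options object.
function loadShared(options) {
  for (var key in options) sharedDefaults[key] = options[key];
  return function tagName(name) {
    return sharedDefaults.xmlMode ? name : name.toLowerCase();
  };
}

// Independent version: each call captures its own copy of the options.
function loadIndependent(options) {
  var own = { xmlMode: false };
  for (var key in options) own[key] = options[key];
  return function tagName(name) {
    return own.xmlMode ? name : name.toLowerCase();
  };
}

var sharedHtml = loadShared({ xmlMode: false });
var sharedXml = loadShared({ xmlMode: true }); // also changes sharedHtml's behavior
var htmlCtx = loadIndependent({ xmlMode: false });
var xmlCtx = loadIndependent({ xmlMode: true });
```

In this sketch `sharedHtml('someElem')` keeps its capitalization because the second `loadShared` call flipped the shared flag, whereas `htmlCtx('someElem')` is lowercased as its own options request.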
|
This pull request is a hack and definitely shouldn't be merged. It introduces circular dependencies, which is probably the worst way to do this. Instead, patching |
|
Generally speaking, keeping an |
|
Sure, I'm up for it, but I must admit that my understanding of the codebase is minimal at best. |
|
That's fine! I'll start off with our high-level goals, then get into specifics.

Goals

Parsing options aren't really handled in a consistent way right now, but as far
I want to draw attention to the "subsequent" part, since one might argue that

Cheerio.prototype.options.lowerCaseTags = false;
var $ = Cheerio.load('<div></div>');
Cheerio.prototype.options.lowerCaseTags = true;
// Should the behavior of this operation reflect the current state of the
// options?
$.append('<H1>Hello</H1>');

If we want this, we'll have to constantly reference (and merge) instance- and

Code

Basically, we need to be able to specify parsing options at instantiation time.

diff --git a/lib/cheerio.js b/lib/cheerio.js
index 2c847e1..91a8008 100644
--- a/lib/cheerio.js
+++ b/lib/cheerio.js
@@ -31,8 +29,10 @@ var $ = require('./static');
* Instance of cheerio
*/
-var Cheerio = module.exports = function(selector, context, root) {
- if (!(this instanceof Cheerio)) return new Cheerio(selector, context, root);
+var Cheerio = module.exports = function(selector, context, root, options) {
+ if (!(this instanceof Cheerio)) return new Cheerio(selector, context, root, options);
+
+ this.options = _.defaults(options || {}, Cheerio.prototype.options);
// $(), $(null), $(undefined), $(false)
   if (!selector) return this;

...then update all the calls to

 Cheerio.prototype.make = function(dom) {
-  return new Cheerio(dom);
+  return new Cheerio(dom, undefined, undefined, this.options);
 };

And the static method

Now that I've written this out, it's clear that my approach requires some API
I'd like to get some feedback from @matthewmueller, @fb55, and @davidchambers |
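The proposal above can be sketched without Cheerio or Underscore. In this toy version, `Toy`, `fillDefaults`, and `parseTag` are hypothetical names chosen for illustration; only `make`, `options`, and `lowerCaseTags` come from the thread. It assumes that instance options are resolved once at construction time, so later changes to the prototype defaults do not affect existing instances, and it copies the caller's object rather than mutating it.

```javascript
// Toy constructor illustrating the idea; this is not Cheerio's source.
function fillDefaults(target, source) {
  // Copy only the keys that the target does not define yet.
  for (var key in source) {
    if (target[key] === undefined) target[key] = source[key];
  }
  return target;
}

function Toy(nodes, options) {
  if (!(this instanceof Toy)) return new Toy(nodes, options);
  // Build a fresh object: caller options first, then prototype defaults.
  this.options = fillDefaults(fillDefaults({}, options || {}), Toy.prototype.options);
  this.nodes = nodes || [];
}

Toy.prototype.options = { lowerCaseTags: false };

// Derived collections inherit the options of the instance that made them.
Toy.prototype.make = function(nodes) {
  return new Toy(nodes, this.options);
};

// Stand-in for parsing: only the tag-name casing rule is modeled.
Toy.prototype.parseTag = function(name) {
  return this.options.lowerCaseTags ? name.toLowerCase() : name;
};

var custom = new Toy([], { lowerCaseTags: true });
var plain = new Toy([]);
var child = custom.make([]);
```

Here `child` lowercases tags because it was made from `custom`, while `plain` keeps the default. Whether existing instances should instead track later changes to the defaults is exactly the open question raised in the comment.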
We should avoid global state. I'd prefer to see the following:

var $ = Cheerio.load('<div></div>', {lowerCaseTags: false});

If cheerio's API provided a way to create an independent constructor function, mutating its prototype would be okay. This could, perhaps, work as follows:

var init = function() {
  function Cheerio(...) {
    ...
  }
  Cheerio.prototype.find = function(...) {
    ...
  };
  ...
  return Cheerio;
};

// a.js
var Cheerio = require('cheerio').init();
Cheerio.prototype.foo = 42

// b.js
var Cheerio = require('cheerio').init();
Cheerio.prototype.foo // => undefined |
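A runnable version of the factory idea above might look like the following. It is only a sketch: `init` is the hypothetical function from the comment, not something cheerio exports, and the `size` method and `nodes` property are placeholders added so the constructor has a body. Both "modules" are simulated in one file by calling `init()` twice.

```javascript
// Hypothetical factory: each call builds a brand-new constructor and prototype.
var init = function() {
  function Cheerio(nodes) {
    this.nodes = nodes || [];
  }
  // Placeholder method so the prototype is not empty.
  Cheerio.prototype.size = function() {
    return this.nodes.length;
  };
  return Cheerio;
};

// Simulates a.js: this module extends its own constructor.
var CheerioA = init();
CheerioA.prototype.foo = 42;

// Simulates b.js: a separate constructor that never sees `foo`.
var CheerioB = init();

var fromA = new CheerioA(['x']);
var fromB = new CheerioB();
```

Because each call returns a distinct function with a distinct prototype object, mutations made through `CheerioA` are invisible to instances of `CheerioB`. The cost is that `instanceof` checks no longer work across the two constructors.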
Generally speaking, I agree with you. But I think this might make sense for people that want to concisely configure Cheerio for their entire module (no need to specify the same options everywhere).
This sounds a little like what I proposed to @Torthu in GH-1: Create a unique prototype for each Cheerio context |
|
I'm bumping this issue because we could really use some feedback before building this out. Does the proposal I made above seem alright? Should we forget about setting options globally, as @davidchambers recommended? It's important for the library to consistently handle these options so that it behaves in a predictable way. Note that this issue also blocks #261. |
|
The code is kind of a mess. We have to pass Even with the above mentioned |
|
@stevenvachon Most functions are attached to cheerio's prototype and can access instance properties. Just modifying |
|
@fb55 They are, but they break away from it in each module. For example, cheerio.js has this:

var api = ['attributes', 'traversing', 'manipulation', 'css'];
api.forEach(function(mod) {
  _.extend(Cheerio.prototype, require('./api/' + mod));
});

but then manipulation.js has this:

var makeDomArray = function(elem) {
  if (elem == null) {
    return [];
  } else if (elem.cheerio) {
    return elem.toArray();
  } else if (_.isArray(elem)) {
    return _.flatten(elem.map(makeDomArray));
  } else if (_.isString(elem)) {
    return evaluate(elem);
  } else {
    return [elem];
  }
};

var _insert = function(concatenator) {
  return function() {
    var elems = slice.call(arguments),
        dom = makeDomArray(elems);

    return this.each(function(i, el) {
      if (_.isFunction(elems[0])) {
        dom = makeDomArray(elems[0].call(el, i, this.html()));
      }
      updateDOM(concatenator(dom, el.children || (el.children = [])), el);
    });
  };
};

var append = exports.append = _insert(function(dom, children) {
  return children.concat(dom);
});

We'll have to pass |
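One way to read the concern above is that module-level helpers such as `makeDomArray` have no access to the instance, so any per-instance options must reach them as an argument. The sketch below illustrates that shape under stated assumptions: `Context`, the two-argument `evaluate`, and the `raw` property are hypothetical, the parser is replaced by a function that only changes letter case, and nothing here is taken from Cheerio's actual implementation.

```javascript
// Stand-in parser (hypothetical): lowercases markup unless xmlMode is set.
function evaluate(markup, options) {
  var keepCase = options && options.xmlMode;
  return [{ raw: keepCase ? markup : markup.toLowerCase() }];
}

// The helper receives options explicitly instead of reading shared state.
function makeDomArray(elem, options) {
  if (elem == null) return [];
  if (Array.isArray(elem)) {
    return elem.reduce(function(all, item) {
      return all.concat(makeDomArray(item, options));
    }, []);
  }
  if (typeof elem === 'string') return evaluate(elem, options);
  return [elem];
}

function Context(options) {
  this.options = options || {};
  this.children = [];
}

// The prototype method is the only place that knows about `this.options`.
Context.prototype.append = function() {
  var elems = Array.prototype.slice.call(arguments);
  this.children = this.children.concat(makeDomArray(elems, this.options));
  return this;
};

var html = new Context({ xmlMode: false }).append('<H1>Hi</H1>');
var xml = new Context({ xmlMode: true }).append('<H1>Hi</H1>', '<P/>');
```

Threading the argument through every helper is verbose, which is presumably the objection being raised. Moving the helpers onto the prototype would trade that verbosity for a larger instance surface.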
|
@stevenvachon Or, these methods could be added to the prototype (prefixed with an underscore). But I get your point. |
|
Anything happening here? I've had to special case this in my code and I'd much prefer it to use something that won't break in the future. |
|
@stevenvachon As far as I know, there is currently nobody working on this. Feel free to send a pull request, though :) |
as proposed by @jugglinmike in #274
as proposed by @jugglinmike in #274, but using `this.options` instead of the prototype to enable `Cheerio.call` calls to work properly
|
I'm closing this in favor of #437. |