Joomla! Discussion Forums



It is currently Wed Nov 25, 2009 6:32 am (All times are UTC )

 




Post new topic Reply to topic  [ 120 posts ]  Go to page Previous  1, 2, 3, 4  Next
Author Message
 Post subject: Re: 1.5 DB schema
Posted: Sat Jul 29, 2006 12:30 pm 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mjaz,

there's no real short description why using autoincremented fields for reference should be avoided. Nevertheless I try to mention some of most important keys:

  • nonportable functions
  • unreliable in multiuser environment without transactions
  • problems when transferring data between multiple systems (servers of same product or different ones) especially when number of existing data is different
  • possible problems with bookmarking actual datarow with multiple DBMSs

This must be enough for now. I could sit down much longer time and type many more but, those issues mentioned are such fundamental basics in data management to leave autoincrement alone in most cases if one wants to have a reliable system for multiuser environment. It takes not more than four small functions (even in Fortran) to have a reliable and portable id generator for each and every DBMS. Doing it this way one has so called 'total control' which is what reliable applications provide.

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Last edited by diri on Sat Jul 29, 2006 12:37 pm, edited 1 time in total.

Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Jul 29, 2006 3:45 pm 
User avatar
Joomla! Guru
Joomla! Guru
Offline

Joined: Thu Nov 10, 2005 10:08 am
Posts: 807
Thanks for explaining!
diri wrote:
(...) It takes not more than four small functions (even in Fortran) to have a reliable and portable id generator for each and every DBMS. Doing it this way one has so called 'total control' which is what reliable applications provide.


If I understand it correctly, the application is responsible for generating an ID. But how do you make sure it's unique?

Has anyone ever experienced issues in Joomla related to autoincremented fields, or is this more of a theoretical thing? If it is an issue, then I suppose it should be on the roadmap for 2.0.

BTW Can anyone recommend a good book or website on all of this? I've never made anything that uses more than 3 or 4 simple tables, but somehow I find all this DB stuff very interesting.

_________________
Better SEO & multi-lingual Joomla sites with Nooku Content
http://www.nooku.org
Nooku Framework for advanced Joomla extension development
http://www.nooku.org/framework


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Jul 29, 2006 4:30 pm 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mjaz wrote:
If I understand it correctly, the application is responsible for generating an ID. But how do you make sure it's unique?

Use one table with field for last number used for critical items. Get this number, add one (+1), store new number when old stored number is lower than new one. ("update field(new_value) when new_value > old_value;"). Otherwise start again with adding one (+1).
mjaz wrote:
Has anyone ever experienced issues in Joomla related to autoincremented fields, or is this more of a theoretical thing?

It's nothing theoretical in multiuser environment. You must also be aware to not to be in clean room environment with Joomla! where db server and web server run on different machines. Without transaction one has to take everything into account there ...

Check it for yourself with a somebody working with different bandwidth. You both have to edit a new item and press save at the very same moment. Ensure to get saved item displayed directly afterwards. Most (lazy) programmers use lastinsert_id to get data to display.

Than you might be able to see what happens:
Dependant on who had faster line and difference in speed, load of servers and so on you might get wrong datarow displayed. A very prominent example for this behaviour is Microsoft's Access even in single user mode after adding a new row since many years (bookmark bug).

When there is no additional check introduced it might even overlap while storing data in db. It can be an issue when something is stored in different tables in dependency of lastinsert_id without explicit having the id at hand. Lazy programmers use lastinsert_id here as well.

Efforts to work reliable with autoincrement are at least as much as working without it (for me at least). Therefore I leave all such things to my applications to be able to do everything I want to do while having control over all.
mjaz wrote:
BTW Can anyone recommend a good book or website on all of this? I've never made anything that uses more than 3 or 4 simple tables, but somehow I find all this DB stuff very interesting.

Have a look for trivial SQL explanations first. You will encounter interesting things when you are at joins and unions. Than have a look for programming in multiuser environment (general, db). Than have a look about programming in stateless environment. You can also read product specific FAQs. When there are issues to watch it is mentioned in every FAQ (normally).

When you went through this, you are there. Have fun. ;)

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Jul 29, 2006 5:43 pm 
User avatar
Joomla! Guru
Joomla! Guru
Offline

Joined: Thu Nov 10, 2005 10:08 am
Posts: 807
diri wrote:
When you went through this, you are there. Have fun. ;)

Thanks! When I'm there, I'll offer my insights to the team developping J! v6.0  ;)

_________________
Better SEO & multi-lingual Joomla sites with Nooku Content
http://www.nooku.org
Nooku Framework for advanced Joomla extension development
http://www.nooku.org/framework


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Jul 29, 2006 11:14 pm 
User avatar
Joomla! Ace
Joomla! Ace
Offline

Joined: Wed Aug 17, 2005 11:03 pm
Posts: 1356
Location: New Orleans, Louisiana
In the brainstorming/conceptual phase of 2.0 development when we are laying out specifications and setting requirements, I absolutely want to address the use of autoincrement fields as well as any other "non-standard" sql in our core queries.  I want to optimise the schema and also optimise the queries.  It will be important to have input and perspective from some of our more experienced DBAs.  I sincerely hope that when this time comes I can count on some of you to step up and help us work out the best data model layer possible.

Louis

_________________
Development Working Group Coordinator
http://webimagery.net - Consulting
http://jxtended.com - Solutions for Joomla! 1.5
A hacker does for love what others would not do for money.


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 1:07 am 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Sun Jan 29, 2006 9:20 am
Posts: 22
diri wrote:
you will be out of luck when trying to dive into Joomla! with database structure as starting point.

When exploring database one would think almost everything is fine. But: What you see there and what Joomla! is doing at runtime are different kind of beasts. Your problem seems to be related to one beast, ItemId is another example.


In this case I don't see any other option than to try and fix corruption (more correctly bad data) in the tables. Joomla won't show me most of the 600 or so members of the site. The problem seems to be duplicated records except for the mismatched numbers that I gave an example of. Making these match (in a test I did a couple of months ago - too busy since then) did make a missing member appear, but the duplicate entries still exist and giving this user extra editing rights doesn't have any effect. Somehow I need to unpick this mess without breaking someting else that I may not notice.

I see transactions are being mentioned - that would be a novelty! I'd have thought that they'd be a pre-requisit.

Reagrds
Mike


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 3:37 am 
louis.landry wrote:
In the brainstorming/conceptual phase of 2.0 development.... It will be important to have input and perspective from some of our more experienced DBAs.


Ah, let some of us hacks in there, too! Actually, Louis, it gives me goosebumps when you talk about that! I can't wait! Let's skip ahead! Thanks for your welcoming attitude towards us, always. It is so much appreciated.


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 6:49 am 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Thu Feb 02, 2006 2:28 am
Posts: 18
just downloaded the DbDesigner application ... A really nice open source product ...

In a past life we had to do all this sort of thing on graph paper ... or at best using AutoCAD  :(

If only we had had a tool like this when MainFrames were King!!!!


Any chance of getting the Joomla schema DbDesigner file rather than the .png graphic ...?

I could find the graphic file over on the Developer network ... but not a DBDesigner file ....

_________________
Kind Regards,

Gary Wilkinson
Principal Consutant
SQR Consulting


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 9:34 am 
User avatar
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Aug 24, 2005 9:34 am
Posts: 662
Location: Bodø, Norway
GaryW wrote:
just downloaded the DbDesigner application ... A really nice open source product ...

In a past life we had to do all this sort of thing on graph paper ... or at best using AutoCAD  :(

If only we had had a tool like this when MainFrames were King!!!!


Any chance of getting the Joomla schema DbDesigner file rather than the .png graphic ...?

I could find the graphic file over on the Developer network ... but not a DBDesigner file ....


I'll upload it when I get the chance, but I'm on vacation right now (again) so it will have to wait a week or so.

_________________
Torkil Johnsen
BEDRE Reklame as - http://www.bedre.no
twitter.com/BEDREreklame


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 9:51 am 
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Thu Feb 02, 2006 2:28 am
Posts: 18
Thanks torkil ...

On vacation ...?  Get your head out of the computer ... and go out and get some fresh air as my old dad says ... :D

_________________
Kind Regards,

Gary Wilkinson
Principal Consutant
SQR Consulting


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 10:05 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mjaz wrote:
Thanks! When I'm there, I'll offer my insights to the team developping J! v6.0  ;)

You are welcome! :)

Don't be afraid, even Rome has not been build in one day. Even old horses can (and must) learn more in every relation.  :'(

Related do Joomla!:
Most underestimated issue is programming in multiuser and stateless environment. This can cause a lot of headache as long as code does not control almost everything. Be aware that the enemy for every programmer in web environment is back button of browser. Multiple users especially in combination with unreliable bandwidth are next trouble causing issues.

cu, diri

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Jul 30, 2006 10:16 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mpetrie wrote:
I see transactions are being mentioned - that would be a novelty! I'd have thought that they'd be a pre-requisit.

They are no pre-requisit. With mySQL they are usable since v5.

Transactions also have a direct drawback: Loss of performance.

For efficient use it takes some efforts at coding side as well as when working without. Dependent on what has to be done I prefer to work with or without transaction. Nevertheless I consider them to be much more than a nice to have.

cu, diri

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Thu Aug 03, 2006 6:47 pm 
Joomla! Intern
Joomla! Intern
Offline

Joined: Sat Dec 31, 2005 7:22 pm
Posts: 61
Quote:
thank you for your agreement. But, transactions only are a "nice to have" in case every DBMS being used supports them. For real independancy (portability) especially in web environment coder has to implement kind of transaction system for itself. Be aware of the fact that most web sites don't run db service on same machine like web server and you see the bottleneck related to secure data storing.



diri... I believe that one of the objectives between 1.5 and 2.0 is to add a database abstraction layer. Doing so may have been postponed to 2.0 because there is currently no DBA working on the project. Much earlier, there was talk of using ADODB or ADODB-Lite for database abstraction, but I do not know if this is still the case. ADODB does simulate transactions for those RDMS who do not support transactions directly. With ADODB, you write your queries using class methods which automatically handle RDMS differences in the ADODB core. I'm not sure if performance problems arise from using this approach for those RDMS that do not support transactions, but it should be reasonably easy to monitor for performance problems with proper testing. .

Technically, Joomla! does not have a problem physically storing data on a remote database server. It's possible to declare the server URL and server port in configuration.php. Data security is more of a host problem for most Joomla! users, but technical documentation for more advanced users may be warranted if it does not already exist.


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Fri Aug 04, 2006 7:47 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mmx wrote:
diri... I believe that one of the objectives between 1.5 and 2.0 is to add a database abstraction layer. Doing so may have been postponed to 2.0 because there is currently no DBA working on the project. Much earlier, there was talk of using ADODB or ADODB-Lite for database abstraction, but I do not know if this is still the case. ADODB does simulate transactions for those RDMS who do not support transactions directly. With ADODB, you write your queries using class methods which automatically handle RDMS differences in the ADODB core. I'm not sure if performance problems arise from using this approach for those RDMS that do not support transactions, but it should be reasonably easy to monitor for performance problems with proper testing. .


I used ADODB in many cases to get portability with existing systems being focused on one DBMS originally. I could not encounter real problems related to performance when DBMS used does not support transactions. The real problem is to not to "forget" a SQL statement in existing code.

For being kind of stoneaged person I always followed approach to get as much static data as possible to lower load of servers involved and to increase speed of delivery. Due to this kind of intelligence is also needed with caching.

mmx wrote:
Technically, Joomla! does not have a problem physically storing data on a remote database server. It's possible to declare the server URL and server port in configuration.php. Data security is more of a host problem for most Joomla! users, but technical documentation for more advanced users may be warranted if it does not already exist.

I'm in happy position to have multiple servers available in my development environment and use this feature very often. It is also good to perform stress tests related to server and network for itself. I can simulate hosting environment to some degree while introducing additional traffic on NIC used.

Related to data security it's not only a question of host - it's also a question of code. Try to simulate more than two (2) page impressions per second and you will see what I mean. Distinguish between page view and page impression - in practice this might be 10++ page views per second. Watch your server for some minutes, enter and view data. ;)

Documentation would be fine but, I'm a programmer for myself and know about the problem with it.  :-[

cu, diri

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Last edited by diri on Fri Aug 04, 2006 7:50 am, edited 1 time in total.

Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Fri Aug 04, 2006 6:29 pm 
Joomla! Virtuoso
Joomla! Virtuoso
Offline

Joined: Fri Sep 02, 2005 10:06 am
Posts: 3071
Location: Solar system - Earth - European Union
I'd like to point out what just said:

diri wrote:
Related to data security it's not only a question of host - it's also a question of code. Try to simulate more than two (2) page impressions per second and you will see what I mean. Distinguish between page view and page impression - in practice this might be 10++ page views per second. Watch your server for some minutes, enter and view data. ;)

_________________
former Q&T WorkGroup Joomla member - Italian Translation Team Member


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Aug 05, 2006 1:51 am 
Diri - Newart -

Help the dumb old woman from Nebraska. The suspense is killing me. What do you mean by this?

Amy


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Aug 05, 2006 8:42 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
Amy,

a well-known problem for many people is to not to be aware of load being caused on servers and network (NIC) when website is visited. Be aware of each little detail in sequence while a page is build and delivered. Such a discussion will open a can of worms and I will interrupt very  harsh later on. ;)

To explain it some keywords must be explained.

Many people think there each hit is always a page impression. That's not true:
First you have a view which can become an impression. A page impression is full transfer (and receipt) of all page related data to browser.

In reality there are much more (count 5 - 10 times or more) page views where only part of data is displayed in browser.

This difference very often is result of slow delivery of content and further clicks of visitor.

Now, calculate load of servers for yourself following an example:
DB server is running on a different machine than webserver. One page needs 50 SQL queries to get all data.

This means:
Visitor touches webserver with one small request (inbound traffic). Webserver triggers PHP (in our case) which in turn runs code related (get data from DB server (outbound / inbound traffic), do what code wants to do with it, etc. pp.). Due to dynamic construction of the page there are few SQL statements at beginning of this process which cause later queries based on data involved.

It's like table tennis: Ping - Pong - Ping - Pong - ...

Try to estimate number of db requests when there are 10 or more pages requested per second (views and impressions!).

Here we are at security:
Code must be aware of interrupts in execution, delayed transfer of data and all things possible in multiuser environment as well as networks. First sign of a DoS can be a missing (not delivered) picture on a page.

In a previous job I had to deal with a CMS not being aware of this:
Code was ready with the task within short time but, it's loops took more than 5 minutes to end program after every start. This caused DoS because ten visitors within short time brought server down (normally this site had about 1.000.000 page impressions (!) per month). To get arguments when speaking with customer I hacked this CMS with a normal browser while causing a DoS and getting error messages which helped to dive deeper without knowing code used.

Remark:
System for itself was save (I have been the developer of setting scheme being used :) ) but, CMS has been hackable. To help customer I set webservers timeout to 10 seconds instead of default value (Apache: 300). The coders started crying and whining ...

In this example you can see very easy why it is so important to have a close look at efficiency of code and schema in db. Lowering number of queries alone does not solve all problems, those queries must be efficient as well because a DoS of DB server can also open holes. This is reason why I wrote about lesser normalization.

Server can be whatever value of  money you want to spend for it - it does not matter when coding is bad.

Try to bring some servers down with many reloads of different pages or run multiple wget processes against them. You will have luck in many cases due to reality in hosted environments. CMSs used might be hackable than even when they are "safe" with little number of requests.

An easy example to fill a 100 mbit/s line is to use one of those download accelerators (Get it Right or the like) to download a file when sitting behind a proxy cluster. Due to functionality of such download tools there will be a large number of (sometimes identical!) partial requests which in turn are proxied by such clusters!

A user with a slow analog modem is able to fill a fast least line for hours this way. Line is not available for other users, NIC of server is also full loaded because cluster is served as fast as possible. Client is not able to receive data in time and requests parts again. Result: DoS.

I still have according logs (Cisco, webserver) available because this happened at one computer centre managed by me. This is one of the reasons why I refuse to save downloadable files in DB - imagine number of queries in a case of using such tools and many downloads.

You see, there are many issues which should be taken into account - it must be weighted.

cu, diri

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Aug 05, 2006 5:30 pm 
Yikes! Okay, Diri, I am going to learn lots and lots and lots from knowing you. Thanks for taking the time to educate me and spell things out so clearly. I will have to read that many more times, I am certain. That was so good!

BTW: That was very nicely written, as well. There was only one thing in there that was, using your term: harsh, and it was "The coders started crying and whining ... ", but you were talking about other people not us! So, that works! lol.

I think sometimes technical people think it's off topic to talk about HOW we talk to each other. And, if you will allow it, I want to quickly address why I think it's important. We can't hear each others voices - so, it is easy to put in a tone of voice that was not intended. And, especially for conversations like these -- where it is so technical and complicated and it requires close thinking and total focus -- we have to be extremely careful with word choice, so that we avoid making people defensive. If they are defensive, they start thinking about how to protect themselves, and are not focusing on the problem anymore. And, often, the group ends up heading into directions that have nothing to do with the real issue.

I just wanted to say that because there are obviously really smart people on this list and we need those brains zoning in on real challenges, just like Diri did on this last post. If we can do that in a way that encourages people, I am really excited about what might be accomplished!

Thanks, Diri for your involvement with Joomla!. You already have made a huge impact and I know you are going to help in really amazing ways. (And, I am just going to hang on and try to learn what I can from you all!  ;))

Amy  :)


Last edited by AmyStephen on Sat Aug 05, 2006 5:33 pm, edited 1 time in total.

Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sat Aug 05, 2006 8:40 pm 
Joomla! Intern
Joomla! Intern
Offline

Joined: Sat Dec 31, 2005 7:22 pm
Posts: 61
Diri... you seem to be suggesting the need for finer grained control of page construction and the need to cache smaller fragments of a page and/or views, plus the associated queries of those view fragments, allowing the update of smaller areas of a page. Is this so?


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Aug 06, 2006 11:16 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mmx wrote:
Diri... you seem to be suggesting the need for finer grained control of page construction and the need to cache smaller fragments of a page and/or views, plus the associated queries of those view fragments, allowing the update of smaller areas of a page. Is this so?


hm. Depends.  8)

There is one general problem with using cache really efficient:
Templating, especially features like template_chooser. User (site op) has to take care about identifiers used.

I would assign caching to step 2. Step 1 would be DB and SQL.

Independent of this in very first moment one has to split a site into two parts:
  • static
  • dynamic

Static part is something being changed very seldom like legals. Due to Joomla!'s binding of backend to same server like frontend it might be possible to save such content in cache directly after publishing / change.

Dynamic part is vague. There is dynamic content not being changed that often and can be called kind of static for this. Other content might be added / changed every minute (i.e. forum). Some features can cause a lot of headache in this relation.

Now to step 1:

Improvement of systems can be achieved very often when having a close look at SQL statements and DB scheme. The more normalization is done at DB scheme the more power is needed to get / save all data needed.

I recommend to double-check each occurrence of "SELECT * FROM" (*). I strongly suggest to check each SQL statement having a JOIN statement (**). Changing code according change of SQL statements will be necessary afterwards.

Why check those statements with such priority:

(*) Get as much data as needed. Get exactly what you need to have. Not more, not less.

(**) There are data where normalization can cause a lot of overhead in DB as well as in code. A typical example are so called xref-tables. Very often they are used to combine data not needing any normalization with normalized data. Accessing related additional xref-table takes performance at DB, code and network while introduction of one field at un-normalized data table would have been sufficient normally without taking space or efficiency (the opposite applies).

Most probably one has a lot of potential for improvement there even when it takes a little more space in DB.

Than one can have a close look at caching. What to do there depends on what you have after changes mentioned above.

cu, diri

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Sun Aug 06, 2006 9:01 pm 
Joomla! Intern
Joomla! Intern
Offline

Joined: Sat Dec 31, 2005 7:22 pm
Posts: 61
For 1.5, 1.6, and so on... much of this thread can do nothing to improve the schema until the need to be backward compatible is no longer a necessity. At the moment, the developers are trying to avoid altering and adding columns until development begins for 2.0. That said, most of this discussion really belongs in a public 2.0 schema development thread that currently does not exist.

diri wrote:
mmx wrote:
Diri... you seem to be suggesting the need for finer grained control of page construction and the need to cache smaller fragments of a page and/or views, plus the associated queries of those view fragments, allowing the update of smaller areas of a page. Is this so?


hm. Depends.   8)

There is one general problem with using cache really efficient:
Templating, especially features like template_chooser. User (site op) has to take care about identifiers used.

I would assign caching to step 2. Step 1 would be DB and SQL.

Independent of this in very first moment one has to split a site into two parts:
  • static
  • dynamic

Static part is something being changed very seldom like legals. Due to Joomla!'s binding of backend to same server like frontend it might be possible to save such content in cache directly after publishing / change.

Dynamic part is vague. There is dynamic content not being changed that often and can be called kind of static for this. Other content might be added / changed every minute (i.e. forum). Some features can cause a lot of headache in this relation.


A possible caching solution for parts one and two is to provide a table column for overriding the default cache lifetime for individual entities such as a content item, form/list or widget. Due to the backward compatability restriction, I seriously don't foresee a change like this until 2.0. In fact, it would be easier to implement with the new 1.5 framework after the node-based schema (NBS) is introduced in 2.0. For an example of this in a CMS without a NBS, take a look at XOOPS 2 which does permit user-definable caching of components (modules in XOOPS lingo).

However, it might be better if the NBS was completely abstract from content data and dealt with the container items (pages, forms, widgets, and their sibling containers) instead, encapsulating such things as content, embedded forms and lists, and widgets. In this way, content is descretely handled as content and can be cached with a sibling content container independently for use in multiple page containers and their associated impressions. For example, one could have a page container tree for rendering content, and each page container in the tree could store sibling containers (branches of the tree) with their associated content, embedded forms/lists, and widgets. In fact, an NBS with this capability should allow fine-grained view fragmention for caching fragments of a page container. To my knowledge, the current NBS schema does not include the columns necessary to interact with the branches of a nested set in a fashion where this is possible.

Quote:
Now to step 1:

Improvement of systems can be achieved very often when having a close look at SQL statements and DB scheme. The more normalization is done at DB scheme the more power is needed to get / save all data needed.

I recommend to double-check each occurrence of "SELECT * FROM" (*). I strongly suggest to check each SQL statement having a JOIN statement (**). Changing code according change of SQL statements will be necessary afterwards.

Why check those statements with such priority:

(*) Get as much data as needed. Get exactly what you need to have. Not more, not less.


I'm in agreement. Queries should be as specific as possible to gather only the data required. This is a possible solution for 1.5 since it should have no effect on the backwards compatability issue.

Quote:
(**) There are data where normalization can cause a lot of overhead in DB as well as in code. A typical example are so called xref-tables. Very often they are used to combine data not needing any normalization with normalized data. Accessing related additional xref-table takes performance at DB, code and network while introduction of one field at un-normalized data table would have been sufficient normally without taking space or efficiency (the opposite applies).


In some schema areas, we are probably going to see this happen natually as further development gets underway on the NBS. The hierarchy table in the Joomla! 2.0 NBS could include additional columns to handle non-normalized columns.

However, there are some factors related to internationalism and data versioning in 2.0 that might require a higher degree of normalization than we have now. For example, a single content_id might be associated with multiple content items written in different languages and there could be multiple versions of those content items for each language instance.

Quote:
Most probably one has a lot of potential for improvement there even when it takes a little more space in DB.

Than one can have a close look at caching. What to do there depends on what you have after changes mentioned above.

cu, diri


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Mon Aug 07, 2006 8:00 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mmx wrote:
For 1.5, 1.6, and so on... much of this thread can do nothing to improve the schema until the need to be backward compatible is no longer a necessity. At the moment, the developers are trying to avoid altering and adding columns until development begins for 2.0. That said, most of this discussion really belongs in a public 2.0 schema development thread that currently does not exist.

You are correct so far. Nevertheless I think it's a positive approach to start discussions now instead of later when time has to be used for decisions and coding.

mmx wrote:
diri wrote:
I would assign caching to step 2. Step 1 would be DB and SQL.


A possible caching solution for parts one and two is to provide a table column for overriding the default cache lifetime for individual entities such as a content item, form/list or widget. Due to the backward compatability restriction, I seriously don't foresee a change like this until 2.0. In fact, it would be easier to implement with the new 1.5 framework after the node-based schema (NBS) is introduced in 2.0. For an example of this in a CMS without a NBS, take a look at XOOPS 2 which does permit user-definable caching of components (modules in XOOPS lingo).

Idea implemented in XOOPS isn't bad IMHO.

mmx wrote:
However, it might be better if the NBS was completely abstract from content data and dealt with the container items (pages, forms, widgets, and their sibling containers) instead, encapsulating such things as content, embedded forms and lists, and widgets. In this way, content is discretely handled as content and can be cached with a sibling content container independently for use in multiple page containers and their associated impressions. For example, one could have a page container tree for rendering content, and each page container in the tree could store sibling containers (branches of the tree) with their associated content, embedded forms/lists, and widgets. In fact, an NBS with this capability should allow fine-grained view fragmention for caching fragments of a page container. To my knowledge, the current NBS schema does not include the columns necessary to interact with the branches of a nested set in a fashion where this is possible.

Related to categories and sections all fields needed are available now ...

mmx wrote:
Quote:
(**) There are data where normalization can cause a lot of overhead in DB as well as in code. A typical example are so called xref-tables. Very often they are used to combine data not needing any normalization with normalized data. Accessing related additional xref-table takes performance at DB, code and network while introduction of one field at un-normalized data table would have been sufficient normally without taking space or efficiency (the opposite applies).


In some schema areas, we are probably going to see this happen naturally as further development gets underway on the NBS. The hierarchy table in the Joomla! 2.0 NBS could include additional columns to handle non-normalized columns.

However, there are some factors related to internationalism and data versioning in 2.0 that might require a higher degree of normalization than we have now. For example, a single content_id might be associated with multiple content items written in different languages and there could be multiple versions of those content items for each language instance.

Now, let's dive a little deeper because this meets one of most plaguing issues (IMHO) with Joomla!:
Autoincremented fields used for identifiers.

Imagine to get rid of autoincrement at IDs due to replacement with IDs being calculated by Joomla! for itself.
Build an index over IDs where dupes are allowed. In such a case you only need an identifying field for language used.

Try it for yourself: It will take very much less resources, will be more flexible and has no negative impact wherever you want to point to. Even a consistent menu / sitemap / pathway without additional entries having different IDs but same content is possible than. Additional effort at backend can not be a decision base because resources have to be saved in frontend where real load occures.

What do you want more?

cu, diri

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Last edited by diri on Mon Aug 07, 2006 8:15 am, edited 1 time in total.

Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Fri Aug 11, 2006 2:43 am 
Joomla! Intern
Joomla! Intern
Offline

Joined: Sat Dec 31, 2005 7:22 pm
Posts: 61
Diri... I will post a partial schema soon so you can see what I'm working on for my 1.5 project. My project is a standalone application based on the Joomla! 1.5 framework. I cannot use the existing core component set and am replacing it with a new application-specific component set. These components use a schema based on nested sets and a registry.

I don't use the Fish. Instead, I'm using an alternate i18n/l10n library that supports Unicode UTF-8 using native PHP with some custom extensions to support multilingualism (m17n) and globalization (g11n).

I'm currently using ADOdb for database server abstraction with and without the use of the Active Record pattern. For MySQL, I'm using the InnoDB engine. I'm also testing my code on several database servers.

Above, I think you are talking about treating content_id and language_id as a unique key. If so, you have a general idea of what I'm trying to do with the exception of a third 'version_id' column that is used to handle content versioning and content production workflow. The content_id column is autoincremented in my table. However, I implemented nested sets using a different approach than the solution currently planned for the NBS. This approach in combination with a registry minimizes the need to normalize tables, eliminates some table redundance in the existing 1.5 schema, and eliminates the need for some core components.


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Fri Aug 11, 2006 7:59 am 
Joomla! Guru
Joomla! Guru
Offline

Joined: Wed Sep 21, 2005 9:27 am
Posts: 691
Location: Somewhere
mmx wrote:
Diri... I will post a partial schema soon so you can see what I'm working on for my 1.5 project.


I'm very curious about it! :)

mmx wrote:
These components use a schema based on nested sets and a registry.


Currently I try to introduce something similar in 1.0.x ...
mmx wrote:
I don't use the Fish.

I don't use it either, your solution is similar to my approach. IMHO to much overhead and error prune due to current application scheme.
mmx wrote:
I'm currently using ADOdb for database server abstraction ... . I'm also testing my code on several database servers.

I see, we are thinking in similar direction.
mmx wrote:
Above, I think you are talking about treating content_id and language_id as a unique key.

Excuse my bad English, please. I try to rephrase it.

A content item in multilanguage capable environment is identified by two IDs:
Item ID and language.

Item ID for one content item is always same, only language identifier differs. Therefore item's ID can not be unique while combination of both is. For me an item ID is a pointer to an item with a can of possible variants (language in this case).

Version ID could be an additional field which would make it possible to have different versions (maybe caused by later editing while translation needs some time) online at the very same time.

Yes, I blow those tables up to get speed. :) It's not the style being told in schools, it's result of having my roots at a time when one had to pay for machine time and CPU cycles. Carefull developed SQL will lower load and increase speed doing it this way.

mmx wrote:
This approach in combination with a registry minimizes the need to normalize tables, eliminates some table redundance in the existing 1.5 schema, and eliminates the need for some core components.


That's correct. You might want to have a look at w-agora at sourceforge which has kind of similar basics. It' a forum software where each forum has it's own settings in simple text files. IMHO a very interesting approach and, after a long while of not seeing something new, it's developed further on.

cu, diri

ps: Had to correct phrasing for better understanding of meaning.

_________________
Don't confuse me with facts. Read
http://www.heise.de/security/Massenhack ... from/rss09


Last edited by diri on Fri Aug 11, 2006 8:37 am, edited 1 time in total.

Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Thu Aug 31, 2006 1:12 am 
User avatar
Joomla! Explorer
Joomla! Explorer
Offline

Joined: Mon Dec 12, 2005 5:34 pm
Posts: 410
AmyStephen wrote:
I can't seem to find DBDesigner's roadmap -- they have apparently shut down their forum due to "attacks". They say there are 3rd party plugins. Do you know if there are plugins for other DBMS?

Hi,

for the records: DBDesigner's dead as reported in some interview I found a couple of months ago on the MySQL website.
The developer(s) joined the "MySQL GUI-Tools" crew.
The successor of DBDesigner is called "MySQL Workbench" and it's available for download from the company's website. Currently working best with (and designed for) MySQL 5, but still far from what one can do with DBDesigner and MySQL 4.0.x It can read the DBDesigner XML file though and can be scripted to automate DB changes and other impressive looking goodies like some "shell" I'm too dumb to grasp  ???
http://dev.mysql.com/downloads/gui-tools/5.0.html

There's also an "editon" for MySQL 4.x with each app as a separate application, lower version numbers, hard to "see" differences in some of them except for the CREATE sql statements. My favorite is Administrator and QueryBrowser.
WorkBench for v4.0 is of no practical use, no features, and more of a prototype than a useable app. I supposed it's canned in favor of pushing the 5.0 Tools Suite, since there were no updates on the 4.x series for a while.

Hope this helps.

Have fun,
CirTap

_________________
You can have programs written fast, well, and cheap, but you only get to pick 2 ...

"I love deadlines. I like the whooshing sound they make as they fly by." Douglas Adams


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Thu Aug 31, 2006 1:30 am 
Thanks! Will look into it. MySQL has come a long ways over the years. Appreciate the info!

Amy :)


Top
   
 
Posted: Thu Oct 05, 2006 1:56 pm 
User avatar
Joomla! Intern
Joomla! Intern
Offline

Joined: Fri Jun 23, 2006 4:41 pm
Posts: 84
Location: Joinville - Santa Catarina - Brazil
torkil,

Thank you for making this DB Schema!
Could you please export a XML file and post it here? It would help me a bit whenever I need to make a new component!

Best regards,

Doug

_________________
Douglas Machado :: Joomla Extension Developer :: Brazilian : [<o>]
PHP Cross Reference for Joomla 1.5: http://ideal.fok.com.br/documentation/joomla-1.5/


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Thu Oct 26, 2006 11:26 am 
Joomla! Fledgling
Joomla! Fledgling
Offline

Joined: Fri Oct 13, 2006 12:17 pm
Posts: 3
Hello anybody, I find this scheme very usefull but  I'm just searching something a bit different... I mean that I need the e/r scheme for the joomla db, could anyone help me?


Top
   
 
 Post subject: Re: 1.5 DB schema
Posted: Fri Feb 02, 2007 4:02 pm 
User avatar
Joomla! Intern
Joomla! Intern
Offline

Joined: Fri Jun 23, 2006 4:41 pm
Posts: 84
Location: Joinville - Santa Catarina - Brazil
blaco wrote:
Hello anybody, I find this scheme very usefull but  I'm just searching something a bit different... I mean that I need the e/r scheme for the joomla db, could anyone help me?


Blaco,

With DBdesigner you can actually make your own e/r schema, however it would be much easier if someone could attach the DBDesigner XML file for the Joomla! scheme.

I would like to have that too!

Best regards,

_________________
Douglas Machado :: Joomla Extension Developer :: Brazilian : [<o>]
PHP Cross Reference for Joomla 1.5: http://ideal.fok.com.br/documentation/joomla-1.5/


Top
  E-mail  
 
 Post subject: Re: 1.5 DB schema
Posted: Tue Mar 13, 2007 2:02 pm 
User avatar
Joomla! Apprentice
Joomla! Apprentice
Offline

Joined: Wed Jan 25, 2006 3:21 pm
Posts: 43
I have just been looking for documentation on the Joomla1.5B1 database schema and found the HTML version, nice work torkill .. But There are no comments for the fields in each table. Is there somewhere that i can find this info ?

thanks,
Mike :)


Top
   
 
Display posts from previous:  Sort by  
Post new topic Reply to topic  [ 120 posts ]  Go to page Previous  1, 2, 3, 4  Next

Quick reply

 



Who is online

Users browsing this forum: Google Adsense [Bot] and 25 guests


You cannot post new topics in this forum
You cannot reply to topics in this forum
You cannot edit your posts in this forum
You cannot delete your posts in this forum
You cannot post attachments in this forum

Jump to:  
Powered by phpBB © 2000, 2002, 2005, 2007 phpBB Group