NoSQL
Submitted by Peter on Fri, 2010-10-15 06:37
NoSQL is a general name for some alternatives to SQL-based databases. It was a big Information Technology fashion hit back in 2009 but is now fading fast due to problems and a lack of benefit.
NoSQL is not NoSQL
There is a relational database named NoSQL, from http://www.strozzi.it/cgi-bin/CSA/tw7/I/en_US/nosql/, that has nothing to do with the NoSQL movement. The NoSQL database is designed to store data in Unix text files for access from Unix utilities. You have to read the whole file every time you look for some rows in a file. The only real use for this type of data storage is to save program settings. It quickly becomes grossly inadequate for anything larger. A better approach for long-running applications is SQLite, as used by Thunderbird.
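The difference between scanning a text file and asking SQLite can be sketched in a few lines of Python. This is only an illustration: the settings data, file names, and function names are invented for the example, and the flat file is a plain tab-separated file, not the actual file format used by the NoSQL relational database.

```python
import os
import sqlite3
import tempfile

# Hypothetical settings data, invented for this illustration.
SETTINGS = [("theme", "dark"), ("language", "en"), ("font_size", "12")]


def lookup_flat_file(path, key):
    """Scan a tab-separated text file line by line until the key is found.

    In the worst case (key near the end, or missing) every line is read.
    """
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            name, _, value = line.rstrip("\n").partition("\t")
            if name == key:
                return value
    return None


def lookup_sqlite(connection, key):
    """Let SQLite use the primary key index instead of scanning every row."""
    row = connection.execute(
        "SELECT value FROM settings WHERE name = ?", (key,)
    ).fetchone()
    return row[0] if row else None


workdir = tempfile.mkdtemp()

# Write the flat file.
flat_path = os.path.join(workdir, "settings.txt")
with open(flat_path, "w", encoding="utf-8") as handle:
    for name, value in SETTINGS:
        handle.write(f"{name}\t{value}\n")

# Load the same data into an SQLite database file.
connection = sqlite3.connect(os.path.join(workdir, "settings.db"))
connection.execute("CREATE TABLE settings (name TEXT PRIMARY KEY, value TEXT)")
connection.executemany("INSERT INTO settings VALUES (?, ?)", SETTINGS)
connection.commit()

flat_result = lookup_flat_file(flat_path, "font_size")
sqlite_result = lookup_sqlite(connection, "font_size")
print(flat_result, sqlite_result)
```

With three settings the two approaches are indistinguishable. The scan only starts to hurt as the file grows, which is the point where the text file approach becomes inadequate.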
The top two
Cassandra and BigTable are the two main NoSQL databases. Cassandra is an Apache Foundation backed project based on open source code originally supplied by Facebook. BigTable is software used by Google for some of their data and is available only as a service, not as open source software.
Faster?
The main benefit claimed for NoSQL style databases is speed when used for storing large amounts of data. Nobody is seeing improved speed when NoSQL databases are used the same way as traditional SQL databases.
The main way to speed up NoSQL databases is to remove the requirement to make the data on disk current. You then run the risk of losing data, money, and customers when your hardware, infrastructure, power, or network fail. In fact you will lose data and you will not know what you lost, or even whether you lost anything. You then have to add code to your applications to do the special things SQL databases do and that makes your NoSQL database as slow or slower than your SQL databases.
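The trade-off can be shown with a toy model. The sketch below is not how any particular NoSQL product works; `WriteBehindStore` is a hypothetical class in which the "disk" is just a Python list, and it assumes the store tells the application a write succeeded before the data is flushed.

```python
class WriteBehindStore:
    """Toy store that acknowledges writes before they reach 'disk'.

    Hypothetical class for illustration; 'disk' is just a Python list.
    """

    def __init__(self, flush_every):
        self.flush_every = flush_every
        self.disk = []      # records that would survive a crash
        self.pending = []   # acknowledged records still only in memory

    def write(self, record):
        self.pending.append(record)
        if len(self.pending) >= self.flush_every:
            self.flush()
        return True  # the caller is told the write succeeded either way

    def flush(self):
        self.disk.extend(self.pending)
        self.pending.clear()

    def crash(self):
        """Simulate a power failure: memory is gone, only 'disk' remains."""
        lost = len(self.pending)
        self.pending.clear()
        return lost


fast = WriteBehindStore(flush_every=4)  # flushes rarely, fewer disk writes
safe = WriteBehindStore(flush_every=1)  # flushes on every write

orders = [f"order-{n}" for n in range(1, 11)]
for order in orders:
    fast.write(order)
    safe.write(order)

fast_lost = fast.crash()
safe_lost = safe.crash()
print(len(fast.disk), fast_lost)
print(len(safe.disk), safe_lost)
```

The application received `True` for all ten orders from both stores. After the crash, the fast store holds eight orders and has nothing left to tell the application which two are gone.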
Some data is not important. If Google lose a few Web sites from a search, they will pick them up next time. Google have lots of flexibility. Clearly you do not want to lose financial transactions. You might not be so worried about user ratings of content or multiple images of a product. If someone visits your shop and sees four pictures of a product instead of five, or finds 11 reviews instead of 15, does it matter? In most cases no. You might have some frequently accessed data with lots of repetition where the loss of a small percentage does not matter. Showing 11 reviews in one second could be more important than showing the whole 15 over five seconds.
Bigger?
SQL databases offer special ways to handle a variety of big problems including long lists of data, large data items, and large inserts of new data. You may have to shop around the various database brands to find the right combinations for your requirements. MySQL has a basic free version and an enterprise version with extra features. PostgreSQL does similar things to the MySQL enterprise version. Oracle has more versions than I could cover in one page and now owns MySQL.
Consider a simple SQL expansion. Your database becomes too big for one disk so you move some tables to another disk. Some databases do not let you spread tables over several disks because all the tables are in one file. MySQL and some other databases use a different file for each table, making the split easy.
Now you have a table that is too big for one disk. MySQL, PostgreSQL, and others let you split tables into ranges, segments, or partitions (the name varies from database to database), and you place one range, segment, or partition per disk.
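The idea behind range partitioning is simple enough to sketch. The database does this routing for you; the Python below only illustrates the principle, and the boundaries, mount points, and function names are assumptions made up for the example.

```python
from bisect import bisect_right

# Hypothetical upper bounds (exclusive) for each partition of an orders table,
# keyed on order id. The final partition takes everything from the last bound up.
PARTITION_BOUNDS = [1_000_000, 2_000_000, 3_000_000]
PARTITION_DISKS = ["/disk1", "/disk2", "/disk3", "/disk4"]


def partition_for(order_id):
    """Return the index of the partition that holds the given order id."""
    return bisect_right(PARTITION_BOUNDS, order_id)


def disk_for(order_id):
    """Return the (hypothetical) mount point that stores the order's partition."""
    return PARTITION_DISKS[partition_for(order_id)]


for order_id in (42, 1_000_000, 2_500_000, 9_999_999):
    print(order_id, disk_for(order_id))
```

A query for one order id touches one partition on one disk, while the application still sees a single table.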
A disk can be a huge storage device created by joining multiple disks together in one RAID array. The current large disk size is 2 TB, 2 TeraBytes, because of a silly software limitation, and 3.5 TB disks are ready to use when computer hardware manufacturers get their hardware up to date. You can get RAID array servers for 32 disks. 32 disks in a RAID6 array is 30 times 3.5 TB or 105 TB, more than you will need for a long time.
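The RAID6 arithmetic is easy to check. RAID 6 uses the equivalent of two disks for parity, so the usable space is the disk count minus two, times the disk size. The function name below is invented for this sketch.

```python
def raid6_usable_tb(disk_count, disk_size_tb):
    """Usable capacity of a RAID 6 array: two disks' worth goes to parity."""
    if disk_count < 4:
        raise ValueError("RAID 6 needs at least four disks")
    return (disk_count - 2) * disk_size_tb


usable = raid6_usable_tb(32, 3.5)
print(usable)
```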
When you do need more than 105 TB, it will be all those video files you are storing. You can manage those video files through a database without storing their contents inside the database files, because the major databases have facilities to store large data items as separate files and those separate files can be on different storage devices, giving you the option of using more than 105 TB.
Lots of video
You could also write your application to store the big files as separate files and put only the registration information in your database, giving you more flexibility but without losing the advantages of an SQL database. Storing the whole contents of big files in your database is only an advantage if the database can search the content of the big files, something you cannot currently do with video.
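A minimal sketch of the "separate file plus registration" approach, using SQLite so it runs anywhere. The table layout, column names, and `register_file` function are assumptions for illustration, not a recommended schema.

```python
import hashlib
import os
import sqlite3
import tempfile


def register_file(connection, path, title):
    """Record where a big file lives; the file's bytes stay outside the database."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        # Read in 1 MB chunks so a huge video never has to fit in memory.
        for chunk in iter(lambda: handle.read(1024 * 1024), b""):
            digest.update(chunk)
    connection.execute(
        "INSERT INTO video (title, path, size_bytes, sha256) VALUES (?, ?, ?, ?)",
        (title, path, os.path.getsize(path), digest.hexdigest()),
    )
    connection.commit()


connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE video ("
    " id INTEGER PRIMARY KEY,"
    " title TEXT NOT NULL,"
    " path TEXT NOT NULL UNIQUE,"
    " size_bytes INTEGER NOT NULL,"
    " sha256 TEXT NOT NULL)"
)

# Stand-in for a real video file, so the sketch runs anywhere.
workdir = tempfile.mkdtemp()
video_path = os.path.join(workdir, "holiday.mp4")
with open(video_path, "wb") as handle:
    handle.write(b"\x00" * 2048)

register_file(connection, video_path, "Holiday 2010")
row = connection.execute(
    "SELECT title, path, size_bytes FROM video WHERE title LIKE ?", ("Holiday%",)
).fetchone()
print(row)
```

The database row stays tiny no matter how large the video is, and the path column can point at any storage device you like.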
The NoSQL alternatives give you no advantages for storing video. They cannot search video, making the storage in a NoSQL database as useless as storing the video in an SQL database. Storing videos as separate files then registering the video in a NoSQL database gives you no advantages over registering the videos in an SQL database.
NoSQL does nothing for video.
Huge text files
Huge text files look similar to huge video files until you realise you can search text. If your huge text files are inserted into your database as data fields, your database software can search all the text. PDF and other document formats can be searched the same as text. NoSQL databases give you no advantages when you perform the search.
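As a rough illustration of searching text held in a database field, here is a sketch using SQLite and a plain `LIKE` match. The table and sample rows are invented, and a real system with huge documents would more likely use the full-text indexing that the major databases provide rather than `LIKE`.

```python
import sqlite3

connection = sqlite3.connect(":memory:")
connection.execute(
    "CREATE TABLE document (id INTEGER PRIMARY KEY, title TEXT, body TEXT)"
)
# Hypothetical sample documents.
connection.executemany(
    "INSERT INTO document (title, body) VALUES (?, ?)",
    [
        ("Insulation", "Foil batts reflect heat; wool batts trap air."),
        ("Routers", "A gigabit router suits a small wireless network."),
        ("Disks", "Choose quiet low power disks for mass storage."),
    ],
)


def search_documents(connection, phrase):
    """Return titles of documents whose body contains the phrase.

    The phrase is used inside a LIKE pattern, so % and _ in it act as wildcards.
    """
    rows = connection.execute(
        "SELECT title FROM document WHERE body LIKE ? ORDER BY title",
        (f"%{phrase}%",),
    ).fetchall()
    return [title for (title,) in rows]


matches = search_documents(connection, "wireless")
print(matches)
```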
When you use an external search facility, the facility usually needs the data as separate files. You are then back to storing your text/document files the same way as video and NoSQL has no advantages.
Google size sites
Google stores a lot of data and developed their BigTable for one main data storage task. They use SQLite for their regular data storage. There are fewer than ten Web sites with similar data storage problems. There are lots of databases outside the Web storing huge amounts of data and most use regular SQL. Wait until you are making a billion dollars per year before investing in NoSQL.
Hierarchical storage
The few huge non-Web databases not using SQL are not using the common NoSQL projects. They are using data-specific hierarchical storage systems. Hierarchical storage was around before NoSQL; in fact it was used before SQL. Some XML databases are based on the old hierarchical storage systems. SQL databases were developed to solve the problems caused by hierarchical storage systems.
Some NoSQL software uses some aspects of hierarchical storage.
Unknown problems
NoSQL is full of unknown problems plus known problems that are rarely mentioned. When you finish reading about all the known problems of existing NoSQL products, you will decide to leave the conversion from SQL to NoSQL until you are generating far more profit to cover the extensive conversion cost and the massive testing phase.
The unknown problems are caused by the lack of people using NoSQL. There are few chances of someone having exactly the same data as you coupled with exactly the same update requirements. Your problems have occurred before with the major SQL databases and people have found solutions. You will not get the same support with NoSQL, leaving you with the massive cost of developing your own support team and massive testing system.
Terminology
NoSQL products fall into three groups. The rest of the NoSQL products are SQL databases with a NoSQL interface added or SQL databases with some features switched off.
Column families
NoSQL products based on column families provide the table part of an SQL database without the SQL interpreter. You go back to the early 1980s and some primitive databases before they developed SQL support. Your programmers have to convert requests from logical requests to detailed mechanical code.
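The gap between a logical request and mechanical code can be seen in a small sketch. The product data and function name are made up, SQLite stands in for any SQL database, and a plain Python list stands in for a store with no SQL interpreter.

```python
import sqlite3

# Hypothetical product rows: (name, category, price).
PRODUCTS = [
    ("kettle", "kitchen", 35.0),
    ("toaster", "kitchen", 49.0),
    ("drill", "tools", 89.0),
    ("whisk", "kitchen", 9.0),
]

# With SQL, the request is stated logically and the database plans the work.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE product (name TEXT, category TEXT, price REAL)")
connection.executemany("INSERT INTO product VALUES (?, ?, ?)", PRODUCTS)
sql_result = [
    name
    for (name,) in connection.execute(
        "SELECT name FROM product WHERE category = ? AND price < ? ORDER BY price",
        ("kitchen", 40.0),
    )
]


def cheap_in_category(rows, category, limit):
    """Without an SQL interpreter the filter and sort are written by hand."""
    matching = []
    for name, row_category, price in rows:
        if row_category == category and price < limit:
            matching.append((price, name))
    matching.sort()  # sorts by price first because price leads each tuple
    return [name for _, name in matching]


manual_result = cheap_in_category(PRODUCTS, "kitchen", 40.0)
print(sql_result, manual_result)
```

Both give the same answer here. The difference is that every new question needs a new hand-written function in the second style, and nobody but your programmers will optimise it.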
Document Stores
Document stores were around in the 1970s and came back in with the initial popularity of XML. Content addressable storage was another flavour. Some document stores let you add an SQL database on top so you can find documents. Most of the major SQL databases handle document storage without requiring anything separate to the SQL database. Document storage has nothing to do with NoSQL.
Key Value / Tuple Store
This is the real NoSQL. Each table is equivalent to one column in an SQL database. Reading an item with several attributes requires gathering data from many tables, all by hand. A table with 50 columns and 5 indexes will be replaced by 50 tuple stores and one or more tuple stores for every index. If the tuple store does not have multilevel indexing for performance, you have to build your own multilevel index using multiple tuple stores. A simple three key index might blow out to 20 tuple stores and need more tuple stores added every time you expand.
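A toy version of the one-store-per-column layout described above shows how much falls to the application. Python dictionaries stand in for the tuple stores, and every name and record here is hypothetical.

```python
# One key-value store per attribute. All names and data are hypothetical.
name_store = {}
email_store = {}
city_store = {}
# A hand-built index: city -> set of customer ids.
city_index = {}


def put_customer(customer_id, name, email, city):
    """Write one logical row as three key-value writes plus an index update."""
    name_store[customer_id] = name
    email_store[customer_id] = email
    old_city = city_store.get(customer_id)
    if old_city is not None:
        # The application must remember to remove the stale index entry.
        city_index[old_city].discard(customer_id)
    city_store[customer_id] = city
    city_index.setdefault(city, set()).add(customer_id)


def get_customer(customer_id):
    """Reassemble one logical row by reading every store by hand."""
    return {
        "id": customer_id,
        "name": name_store[customer_id],
        "email": email_store[customer_id],
        "city": city_store[customer_id],
    }


def customers_in(city):
    """Use the hand-built index, then gather each customer's attributes."""
    return [get_customer(cid) for cid in sorted(city_index.get(city, set()))]


put_customer(1, "Ann", "ann@example.com", "Sydney")
put_customer(2, "Bob", "bob@example.com", "Perth")
put_customer(1, "Ann", "ann@example.com", "Perth")  # Ann moves to Perth

perth = customers_in("Perth")
sydney = customers_in("Sydney")
print([customer["name"] for customer in perth], sydney)
```

That is three attributes and one index. Forget the stale index clean-up in `put_customer` and Ann appears in two cities at once, which is the kind of bug an SQL database prevents for you.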
Public opinion
TechRepublic says NoSQL expands transparently and they’re usually designed with low-cost commodity hardware in mind. All the major SQL databases run happily on low-cost commodity hardware. Most of the major SQL databases expand as transparently as the available NoSQL products. They suggest you can do away with your DBA, DataBase Administrator, when you use NoSQL then point out you still need someone to perform the same work. They do not mention that you need some extra programmers to implement in your code all the things missing in NoSQL. They list five advantages for NoSQL but most of them are not true and some are only true when you work with the rare exception of unreliable data. They list five serious problems with NoSQL and most of them distill down to the need for many expensive, experienced, highly skilled people you do not need with SQL.
Quite a few people with experience point out that Google, the developers of BigTable, run all their mission critical data on SQL databases. Most of their data is in SQL databases. The only data not in SQL is a few of the tables behind their search.
Focus
Successful Web site owners know their success comes from creating a better product, service, or Web site, not reinventing the software behind the Web site. If your Web site becomes big enough to require something special, you will get the best results quicker by contributing to an existing open source SQL database. A small change to MySQL is far easier than a hundred percent reinvention. There are more people with the right experience. There are more people ready to invest in your improvement because they are working on something similar. Everything is easier.
Conclusion
If you are not in the top ten Web sites, the cost of switching from SQL to NoSQL is greater than the advantages. The unknown problems of NoSQL will kill you long before you outgrow SQL databases. Google and Facebook have the resources to support a NoSQL approach for the biggest database table in their main applications.








