Behind virtually all websites and web applications, hides an SQL database. And when designing it, it is necessary to choose with accuracy the different types of columns. Something that sometimes proves more difficult than expected … What is the difference between one CHARand one VARCHAR? What are they BLOB? This small guide will avoid some errors that could well, ultimately cost us a few extra gigabytes and unnecessary slowdowns.

We will concentrate on MySQL because it is the DBopen source the most widespread, but either MySQL, PostgreSQL or another does not change much, given that they are all expected to respect the SQL standard – although ‘In practice, they each have some specificities – one will easily find in another DBMS.

Strings of characters

CHAR & VARCHAR

Many are unaware of the difference between the two as they allow both to store strings up to 255 bytes. The difference lies in the size needed to store the chain. It will CHAR(x)store on x bytes, so if you choose one CHAR(255)for your column, each field will take 255 bytes, even if you store one word. Conversely, VARCHARas the name implies, it has a variable size. It VARCHAR(x)stores it on x bytes maximum, so one VARCHAR(255)can weigh only 64 bytes.

In this case, why not always opt for the VARCHAR? It is a bit heavier for the same amount of data stored, since in addition to your string, it stores the size of it (since it is variable).

So we will use it CHARfor a string that is always of the same length, for example a client reference that always makes the same number of characters (sorry, I have no more example that comes to mind: P ) While one VARCHARwill be perfect for a pseudo.

Note: From MySQL 5.5, VARCHARcan be up to 65,535 bytes, the same size as TEXT.

TEXT & BLOB

Types TEXT, which range from TINYTEXTto LONGTEXT, are used to store character strings from 256 bytes and up to 2 ^ 32, or 4,294,967,296 bytes. For example, in WordPress, the content of each article is stored in one LONGTEXT, with that, one can therefore write novels!

The types BLOBare similar to TEXT, however, they store the data in binary and not in string. Two important points differentiate them from their counterparts TEXT:

  • As they store the strings in binary, it is possible to use several sets of characters indifferently, which is not possible with TEXTsince one defines a set for the column and one holds there.
  • Moreover, the type TEXTdoes not take into account case, so it is possible to compare a string uppercase to a lowercase, with BLOBthis is not possible since it will treat the code representing the signs of the chain, not the signs themselves.

L Corresponds to the size of the stored string, and the + x is the size in bytes necessary to store the string size.

TypeMaximum sizeSpace Required
TINYBLOBTINYTEXT(256)L + 1
BLOBTEXT(65,536)L + 2
MEDIUMBLOBMEDIUMTEXT2 24 (16 777 216)L + 3
LONGBLOBLONGTEXT2 ^ 32 (4 294 967 296)L + 4

The Numbers

The integers

Integer numbers are used with type columns INT, which range from TINYINTto BIGINT.

Column typeMaximum sizeSpace Required
TINYINTFrom -128 to 1271 byte
SMALLINTFrom -32768 to 327672 bytes
MEDIUMINTFrom -8388608 to 83886073 bytes
INTFrom -2147483648 to 21474836474 bytes
BIGINTFrom -9223372036854775808 to 92233720368547758078 bytes

Two attributes are particularly interesting with integers. It is about ZEROFILLand UNSIGNED. The first one adds 0 to the left of your number so that it reaches a predefined number of characters. The second allows “doubling” the capacity of a column.

  • It is possible to specify a minimum character size with INT. If the number entered does not reach the minimum number of digits specified, the attribute ZEROFILLwill add a 0 to the left. For example, to have at least 5 characters you can do INT(5) ZEROFILL, if you enter 8, then it will ZEROFILLcomplete it to save 00008.
  • The attribute UNSIGNED, on the other hand, allows to ignore the sign. We therefore keep the same storage interval, but by allowing only the positive, we double the maximum allowed number. For TINYINTexample, instead of having an interval between -128 and 127, the interval is between 0 and 255, which is very convenient in some cases!

The decimals

NUMERIC & DECIMAL

They are equivalent and can take two parameters (but this is not required), precision and scale.

  • The precision represents the number of significant digits (not the 0 on the left)
  • The scale gives the number of digits after the decimal point

For example, DECIMAL(6,3)may store: 5,11; 005.108; 123,108 … Attention, the precision includes the digits before AND after the comma. So with an accuracy of 6 and a scale of 3, you can store numbers with three digits before the decimal point and three digits after.

FLOATDOUBLE&REAL

FLOATCan be used without parameters, however in this case four additional bytes will be used for the value of the column. It is also possible to use it with precision and scale, in the same way as NUMERICand DECIMAL.

DOUBLEAnd are REALused without parameters. DOUBLEIs normally double REALin the SQL standard, ie there is an accuracy of 4 bytes for REALand 8 for DOUBLE. Nevertheless, MySQL addresses an accuracy of 8 in both cases, so much use DOUBLE. In this way, if we change DBMS, we will avoid surprises.

What is the difference between NUMERICDECIMALand FLOAT?

NUMERICAnd DECIMALstore the values ​​as a character string, while FLOAT, but also DOUBLEand REAL, store them as a number. It’s cool to know, but not great interest you will tell. Well if, since a comma number is never quite exact, so as a number, the values ​​are approximated.

That is to say that your 2,045 can be stored at its approximate value 2,045,0001. Which is not quite the same thing … and that may pose a problem if we have to make comparisons on it. On the other hand, in the form of a string, your numbers do not move. It is always good to know, depending on the applications, the difference can be important.

Hours and dates

DATETIMEDATETIME&YEAR

As their respective names indicate, these four types are used to store the date, time, time / date and year. They are stored in the following formats:

  • YYYY-MM-DD (but YYYYMMDD or YYMMDD also work)
  • HH: MM: SS (but also HHMMSS)
  • YYYY-MM-DD HH: MM: SS (but also YYYYMMDDHHMMSS)
  • YYYY

Warning, in the format where you do not specify the century, MYSQL will take the 20th century by default for dates after 70 while it will infer that you are talking about XXI if you have a date between 00 and 69 inclusive. Both provide an AAAA format to avoid errors;)

MySQL supports dates between 1001-01-01 and 9999-12-31, except for those YEARthat support an interval between 1901 and 2155, included (but which has the advantage of being very light because encoded on a single byte) .

TIMESTAMP

It is rarely used because it does not correspond to the TIMESTAMP UNIX (which is the one of the majority of programming languages), in which the TIMESTAMP corresponds to the number of seconds elapsed between the 1st January 1970 to OOh00m00s and the date in question, logically Expressed in seconds. MySQL actually stores them in the same format as the DATETIME, so impractical, for him to pass a TIMESTAMP generated by php, ruby ​​or other.

Multiple choice columns

ENUM, As the name indicates, allows to enumerate choices. It operates a little differently from the other columns because it must first inform the different values ​​that it can take. It is therefore assigned several choices by default. For example, let’s take a genre column in which we want to list several genres of cinematography:

Genre ENUM ( 'horror' , 'comedy' , 'thriller' , 'reporting' )

Here is an example of a column ENUM:

ValueIndex
NULLNULL
0
‘horror’1
‘comedy’2
‘thriller’3
‘report’4

This column may not contain values ​​other than those specified in the definition of ENUM, except:

  • NULL (If allowed when saving the column)
  • An empty string if one attempts to insert an invalid value.

The index of the empty string is 0. To insert a value, one can either directly submit the value in question, or provide its index as an argument. The type ENUMmay have 65535 different values. Its strength is that it really does not take place since it is coded on 1 byte!

SETIs similar to the fact ENUMthat a single line can take several values. For example, for the column below:

SET genre ( 'horror' , 'comedy' , 'thriller' , 'reportage' )

An entry can take as ‘horror’ or ‘thriller’ value, but also ‘horror, thriller’. To record the values, the operation is again very similar to that of ENUM, in the case where the value of this one is directly submitted. To give several values, you have to separate them by commas, for example, to have thriller and comedy in the same line we will do ‘thriller, comedy’.

It is also possible to use indexes to record values. We have in order horror, comedy, thriller and reporting. So a four digit number (since four values). If the number corresponding to the value is 1, it is recorded, if it is at 0, it is not recorded. A small picture would perhaps be welcome:

Presencehorrorcomedythrillerreport
present1111
Not Present0000

So 0000 will give an empty line, 0001 will give report, 1010 will give horror, thriller etc …

However, unlike ENUMSETcan only take 64 values.

Finally, be careful, SETand ENUMare specific to MySQL, if you use them, it will subsequently be difficult to change DBMS.

Author

Am a tech geek.. Do you wanna know more about me..? My contents will do tell you.

Pin It