7.0 Implementing a database design
Data values in each
column must be valid. Can be enforced by Primary and Foreign
Keys, Default definitions and Check constraints.
Each row in a given
table must be unique in the entity. Can be enforced by primary
key which by default creates a clustered unique index on the
column(s) that are covered by the primary key.
between tables through Primary Key and Foreign Key.
table has columns that define similar attributes and no column
contains multiple values in a single row.
"The key, the whole key and nothing but the key". Each column
that is not part of the primary key should depend on all
columns that make a composite Primary key and not only depend
on a subset of the composite Primary key.
that are not covered by the Primary Key should not depend on
Data that is irrelevant to a
Data of which multiple copies are stored
in the database.
Data that is derived from existing data.
Tables and indexes should be
distributed across as many different physical disks as
possible. Placing the transaction log on its own disk also
improves performance, because SQL writes data to the
transaction log sequentially.
Tables that can (and are
likely to) participate in joins with each other should be
placed on different disks so that they can be scanned in
functions can not be used in Where clauses because aggregate
functions are calculated from the result set and the Where
clause produces the result set.
can not be used. All columns in the Select list must occur in
the Group clause, but can have additional columns. Returns a
aggregate value for every row in the result set. Will only
return one summary row for every specified group when used
without ROLLUP or CUBE.
the columns in the having clause must be listed in the group
by clause or used in aggregate functions.
columns in the Compute by clause must appear in the order by
When compute is used ORDER BY
must also be used. Can not contain aliases. All columns that
are present in the compute by clause must be listed in order
by. Compute returns both detail information and sum values as
separate result sets. Without the BY, you will get one row for
every column that is specified in the compute clause. When BY
is used then you will receive a sum row for every group that
is specified after BY word.
To summarise the
differences between COMPUTE and GROUP BY:
produces a single result set. There is one row for each group
containing only the grouping columns and aggregate functions
showing the subaggregate for that group. The select list can
contain only the grouping columns and aggregate functions.
It produces multiple result sets. One type of result set contains
the detail rows for each group containing the expressions from
the select list. The other type of result set contains the
subaggregate for a group, or the total aggregate for the
SELECT statement. The select list can contain expressions
other than the grouping columns or aggregate functions. The
aggregate functions are specified in the COMPUTE clause, not
in the select list.
Uses a comparison operator to
match rows from two columns based on the values that the
columns have in common. Only returns rows that match.
Returns all rows from at least
one of the tables or views that are specified in the From
Left Outer Joins:
Returns all rows from the left table regardless of if there
is a match in the right table. The right table will return
NULL where there is no match.
Right Outer Join:
Like Left outer joins, but the other way around.
Full Outer Joins:
Returns all rows from both tables. Where a row in either
table has no match, NULL values will fill the fields that
have no match.
All rows in the left table
are matched with all rows in the right table. The result of a
Cross join is a Cartesian product. A Cartesian product returns
all possible combinations of rows from every table that is
involved in the join(s) operations. The amount of rows that
will be returned are from two tables are the number of rows in
the first table * the number of rows in the second. A CROSS
JOIN with a WHERE clause works like an INNER JOIN.
The where clause is
executed after the join clause which might not give you the
results that you want. For example if you want to return all
the rows from one table with a left join and use a where
clause, this will limit your result set to only show the rows
that match the criteria of the where clause.
Processes all chosen values in one column and
produces a result value. SUM,AVG,COUNT,MAX,MIN,COUNT(*)...
Let you combine the result sets from two or more SELECTs into
one result set. The result sets must have the same number of
columns and have compatible data types.
Must fetch data rows in a serial
pattern from the result set. FETCHNEXT is the only operation
allowed. Can modify data.
FORWARD-ONLY and READ-ONLY
can be fetched from anywhere in the result set. Is often use
in OLTP applications where the cursor is mapped to a grid or a
list box. When the user scrolls up and down in the grid,
scroll fetches are made to retrieve the data rows from the
When this option is set to ON, an
overflow or divide-by-zero will terminate the query or batch.
If this option is set to OFF, a warning message will be
displayed but the query will complete.
When set to ON: If NULL values are present in
aggregates an error message is generated. Divide-by-zero and
arthimetric overflow will cause the statement to be rolled
back and return an error. When this is set to OFF, no error
messages are generated.
When this option
is set to ON an error will be generated when a loss of
precision occurs (ex. decimals). When this is set to OFF, no
errors are reported, and the result will be rounded to the
precision on the columns or variables in that result.
An attribute that allows NULL values
is called NULLABLE
When this option is set
to ON, a comparison of an equal(=) and not equal(<>)
will always return NULL if any of the arguments contain a
NULL. If this option is set to OFF then True or False will be
returned depending on whether both of the arguments are NULL,
or only one of the arguments contains NULL.
When this is set to ON, new columns created
with the ALTER TABLE and CREATE TABLE statements allow null
values if no nullability status of the column is not
explicitly specified. Has no effect on columns created with an
explicit NULL or NOT NULL. SET
When this option is set to ON new columns that
are created using the ALTER TABLE and CREATE TABLE statements
are by default NOT NULL. Has no effect on columns created with
an explicit NULL or NOT NULL.
Fractions cannot be
specified in the SIZE, MAXSIZE, and FILEGROWTH parameters. To
specify a fraction of a megabyte in the size parameters,
convert to kilobytes by multiplying the number by 1024. For
example, specify 1536 KB instead of 1.5 MB (1.5 times 1024
Occurs when page
density is low. Page density refers to how full, or dense, a
page is. Lower page density equates to more I/O's when
performing a SELECT statement. In SQL 7.0, a page is 8K. The
maximum amount of data which can be contained in a single row
is 8060 bytes, not including text, ntext and image data. Let's
say that you have a SQL 7.0 table with a row size of 4,040
bytes. Only one row will fit on a page in this scenario.
However, if you were able to reduce the row size to 4,030
bytes, then two rows will fit on one page. This would result
in half the number of I/O's per SELECT statement, making a
much more efficient table design. Internal fragmentation can
be evaluated by looking at the "Avg. Page Density
(full)" line on DBCC SHOWCONTIG output. As a general
rule, it should be greater than 90%.
Occurs when extents are not
contiguous. Space is allocated to tables and indexes in
extents. An extent is 8 pages. So, in SQL 7.0, an extent is
64K. When extents are out of order on the disk, this will
result in less than optimal data access. It's basically the
same philosophy as disk fragmentation. External fragmentation
can be evaluated by looking at the "Scan Density [Best
Count:Actual Count] line on DBCC SHOWCONTIG output. This value
should be 100.00%
must be assigned to a local variable and the keyword OUTPUT
must be used.
Stored procedure that executes
Against every table in the current database.
UPDATE STATISTICS: against one table that you specify.
A RAID is seen in
PerfMon. as one physical disk.
NOT FOR REPLICATION PRIMARY KEY
When a replication agent replicates a row to a Subscriber, the
identity value is not changed when the row is inserted in the
Subscriber's table. The seed value at the subscriber is not
affected and will increment with its own seed and not start
incrementing beginning from the newly inserted seed value. If
you are using transactional replication with the
Immediate-updating Subscribers option, do not use the IDENTITY
NOT FOR REPLICATION design. Instead, create the IDENTITY
property at the Publisher only, and have the Subscriber use
just the base data type (for example, int). Then, the next
identity value is always generated at the Publisher.
uses the newid() function generates a 16bit datatype random
number with no other meaning.
To improve the
performance of DSS systems you can denormalize the database
and add preaggregated functions. Adding more indexes also
helps DSS systems. Normalizing only helps OLTP.
Setting the priority
too low for a connection makes the connection more likely to
be terminated if a deadlock situation occurs. There are only
two levels of priority: low and normal. There is no High
be used (instead of triggers) whenever possible. Triggers
first try to implement the changes and then roll back the
transaction if the change violates the specified criteria,
which will cause extra processing. Note that Check constraints
can not delete existing data and cannot reference other
tables. If constraints exist on the trigger table, they are
checked prior to a trigger execution. If either the primary
key or foreign key constraints are violated, the trigger is
If you do not use
the WHERE clause all rows in the table(s) must be read. You
should create indexes on columns that are often use in WHERE
A global NT group in
SQL = Domain\Group
A local NT group in SQL = BUILTIN\Group
can be placed in WHERE clauses when used in a subquery (using
parenthesis). When using the HAVING clause, all columns that
appear in the SELECT list or in the HAVING clause must either
be listed in the GROUP BY clause or be used in a aggregate
Shared locks are held while the data is
modified. Avoids dirty reads, but data can be changed before
the transaction has completed which can result in
non-repeatable reads or phantom data
Isolation level 0. No shared locks. Dirty reads,
nonrepeatable reads and phantom data can occur.
Shared locks on selected data. Prevents dirty
reads, nonrepeatable reads but does not prevent phantom data.
a range lock on the data that is specified which prevents
others from updating and inserting until the transaction has
Use row-level locks rather than the coarser-grained page- and
Use a table lock rather than using finer-grained row- or
page-level locks. SQL Server holds this lock until the end of
the statement. However, if you also specify HOLDLOCK, the lock
is held until the end of the transaction.
Use an exclusive lock on a table. This lock prevents others
from reading or updating the table and is held until the end
of the statement or transaction.
Use update locks instead of shared locks while reading a
table, and hold locks until the end of the statement or
transaction. UPDLOCK has the advantage of allowing you to read
data (without blocking other readers) and update it later with
the assurance that the data has not changed since you last
deadlocks: Use resources in the same sequence in all
transactions. Avoid user interaction in transactions. Keep
transactions short and in one batch. Use as low isolation
level as possible.
17-19 Hardware or software errors
Fatal.The client will be disconnected.
To be able to
restore the database to a moment in time it must be possible
to backup the active transaction log. This can only be done if
the log and the primary data files can be reached. You should
place the log and the primary data files on a mirror set.
specifies the maximum total numbers of decimal digits that can
be stored, both to the left and to the right of the decimal
point. Must be a value from 1 trough max precision (max 28
unless the server is started with the /p parameter of sqlservr,
then the max is 38
the maximum number of decimal digits that can be stored to the
right of the decimal point. Form 0 trough p. Default 0
: datepart = yy, increment number = 4, date = getdate()
start, length) select substring('Daniel', 1, 3) OUTPUT: Dan
NOTE: start can not be 0, then the output will return NULL
FREETEXT predicates are typically used in the WHERE clause of
FREETEXT(column | *, 'freetextstring')
CONTAINS(column | * '' can use 'searchword' OR 'searchword'.
FREETEXTTABLE functions can only be use in the FROM clause.
Generate a result table that return two columns, Key and Rank.
You create CHECK
CONSTRAINTS with ALTER TABLE on an existing table.
MERGE REPL. Does not
support the timestamp datatype.
A clustered index
should be kept as narrow as possible to reduce disk space and
to reduce I/O caused by modifying data in the table.
Nonclustered indexes use the clustered index keys as pointers
to data, and therefore when data is modified, it will affect
both the clustered and all nonclustered indexes. You should
create a clustered index on the Primary key column(s) and
create nonclustered indexes on columns that are often used in
WHERE clauses. In a composite index use the column that has
more unique values first.
text, and images can only be placed on specific filegroups and
not on specific files within a filegroup.
Should be used when you have a narrow table
with many rows
Should be used when you have a table
with a lot of columns, but only a few are queried on a regular
Everytime a BEGIN
TRANSACTION is found, the value of @@TRANCOUNT is incremented.
Is used to keep track of nested transactions. When a COMMIT
TRANSACTION is found and @@TRANSCOUNT = 1 the transaction is
committed and resources are freed. If @@TRANSCOUNT > 1 then
@@TRANSCOUNT will be decremented by one but no resources will
Will cause SQL to verify that values created by
an INSERT or UPDATE are within the value range that was
specified when the view was created. When this option is
chosen, an error message will be returned if the value is out
of range. sp_who: returns a list of all connections to the
server and the process identification of these processes=spid.
You will also get a column named blk(block) which indicates
the spid of another process that is blocking another process.
Used to force a process to terminate. If any changes have been
made they will be rolled back and the lock will be released
first when the transaction has rolled back.
Will complete the query on a remote
server. Must have a linked server set up (remote (old) server
will not do, it can only exe. stored procedures on a remote
server and not queries).
Contains all the information that it needs to connect to a
One row with particular primary key value in the parent table
corresponds to one row in the child table with the same value
as a foreign key.
One row with a particular primary key value in the parent
table corresponds to multiple rows in the child table with the
same value as a foreign key. eg: a customer (parent) may place
zero or more orders.
be accomplished by using two One-to-many relationships to a
A foreign key can
only implement a one-to-one or a many-to-one relationship.
primary key can only implement a one-to-one or at one-to-many
IMPLICIT_TRANSACTIONS is set to ON: if the user does not
explicitly commit or roll back the transaction (and all
modifications that the transaction has made), the transaction
will be rolled back when the user disconnects.
Sort: Order By
in a where clause